ABSTRACT

A recent area of interest in theoretical computer science has been in the construction 
of so-called pseudo-random bit generators. These generators "stretch" a short sequence 
of truly random bits into a longer sequence of "pseudo-random" bits. These bits 
are sufficiently indistinguishable from truly random bits to be useful in deterministic 
simulation of probabilistic computation. 

Let us say, informally, that a function is one-way if it can be computed in 
polynomial time but no family of polynomial-size circuits can invert it with high 
probability. Yao [Y] has recently proven that if any such function exists, then it can be 
used to construct a pseudo-random bit generator. Furthermore, the existence of this 
generator implies that $R \subseteq \bigcap_{\epsilon > 0} \mathrm{DTIME}(2^{n^{\epsilon}})$.

No proofs of the results have previously appeared in print. In this thesis, we present 
proofs of these results. In addition, we consider two other types of one-way function. 
The first type is much weaker than Yao's and we show that if such a function exists, 
it can be used to generate a somewhat less powerful pseudo-random bit generator. We 
then consider a second, much stronger type of one-way function and show that if such 
a function exists, then pseudo-random bit generators can be constructed which imply
$R = P$.
Chapter 1
Introduction 



A major goal of theoretical computer science is to find "efficient" algorithms for
solving various problems. In this context, "efficient" generally means that the algorithm
runs in time bounded by some polynomial in the length of the problem instance.
Unfortunately, finding such algorithms has so far been very difficult to accomplish
in the case of many practical problems. Equally useful to the theoretician, although
perhaps not the programmer, is a proof that no efficient algorithm exists for a given
problem. Success in this area has been even less common.

The theory of NP-completeness has provided an alternative in the latter case. 
Hundreds of practical problems have been shown to be NP-complete and this is
generally taken as persuasive evidence that they are intractable.

Recently, an alternative to finding provably efficient methods for solving given
problems has also been considered. Instead of finding algorithms which always
give the correct solution to a given problem, one tries to find algorithms which give
the correct solution with a very high probability. These "random" or "probabilistic"
methods have been the focus of much research.

A probabilistic algorithm relies on coin flips in order to make certain decisions
and its behavior on any given input is therefore not deterministic. Adleman [A] was
the first to explicitly define the class R ("random polynomial time") of those problems
efficiently solvable by probabilistic algorithms. Informally, a problem is in R if there is
a polynomial time algorithm which solves it with only a very small chance of error. The
chance that a solution given by such an algorithm is incorrect is just the probability
that some unlikely sequence of coin flips is produced during its execution.

Note that any sequence of K coin flips is easily represented as a sequence of K
bits. Thus, any probabilistic computation can be simulated by a deterministic machine
which simply runs the probabilistic computation over and over again, trying all possible
$2^K$ sequences of coin flips. Any polynomial time random algorithm can therefore be
simulated deterministically in exponential time.
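
The exhaustive simulation just described can be sketched in a few lines. This is only an illustration: `simulate_deterministically` and `toy_alg` are hypothetical names, and the toy algorithm is an invented stand-in for a probabilistic algorithm that takes its coin flips as an explicit bit sequence.

```python
from itertools import product

def simulate_deterministically(prob_alg, x, num_flips):
    """Run prob_alg(x, flips) on all 2**num_flips flip sequences.

    Returns the fraction of flip sequences on which the algorithm accepts.
    The loop is deterministic but takes time exponential in num_flips.
    """
    accepts = 0
    for flips in product((0, 1), repeat=num_flips):
        if prob_alg(x, flips):
            accepts += 1
    return accepts / 2 ** num_flips

def toy_alg(x, flips):
    # Hypothetical one-sided-error test: it should accept exactly the odd x,
    # but it wrongly rejects when every coin flip comes up 0.
    return x % 2 == 1 and any(flips)

fraction_odd = simulate_deterministically(toy_alg, 5, 3)   # 7 of 8 sequences accept
fraction_even = simulate_deterministically(toy_alg, 4, 3)  # no sequence accepts
```

A deterministic decision can then be read off the acceptance fraction, at the cost of enumerating every flip sequence.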

Unfortunately, problems which require exponential time to solve are beyond the
reach of even today's fastest computers. As a result, effort has been focused on methods
for "stretching" a short sequence of truly random bits generated by coin flips into a
longer sequence of "pseudo-random" bits, without performing any additional coin flips.
Ideally, such "pseudo-random" bit sequences should be sufficiently indistinguishable
from truly random sequences to be useful in simulating probabilistic computations.



So far, every such pseudo-random bit generator which has been proposed is based
on an unproved assumption that some given problem is intractable. Blum and Micali [BM]
were the first to demonstrate such a generator. Theirs is based on the assumed difficulty 
of solving the so-called "discrete logarithm problem". Subsequent generators have been 
based on the difficulty of factoring [G,GMT,Y] and the quadratic residuosity problem 
[BBS]. 

Recently Yao [Y] has proven a much more general theorem concerning pseudo- 
random bit generators. Let us say, informally, that a function is one-way if it can be 
computed in polynomial time, but no family of polynomial-sized circuits can invert 
it with high probability. Yao states that if any such one-way function exists, it can 
be used to construct a pseudo-random bit generator. Furthermore, the sequences 
produced by such a generator are indeed useful for simulating general probabilistic 
computation: they can be used to simulate any polynomial time probabilistic algorithm
deterministically in sub-exponential time. It remains as a major open problem to
demonstrate that any such one-way function actually exists. 

This thesis presents proofs of Yao's results in Chapter 2. Proofs of the results 
have not previously appeared in print. In Chapter 3, we consider two other types 
of one-way functions. The first type, which is weaker than Yao's, cannot
be inverted with high probability by any circuit of some fixed polynomial size. We 
show that if such a function exists, a somewhat less powerful pseudo-random bit 
generator can be constructed. We then consider a second type of one-way function 
which is very much stronger: no circuit of some fixed sub-exponential size can invert
it without being mistaken frequently. We then show that if such a function exists it 
is possible to construct a bit generator which can be used to simulate any polynomial 
time probabilistic algorithm deterministically in polynomial time. 



Chapter 2 
Yao's Theorems 



In this chapter, we discuss conditions under which good pseudo-random bit 
generators can be constructed. We will then show that such generators can be used 
to simulate probabilistic computations deterministically. The results are all due to
Yao [Y].

2.1 Building Generators From One-Way Functions

Suppose G is a deterministic program which, given some k-bit sequence x (a
"seed") as input, outputs some bit sequence $b_1, b_2, \ldots, b_{P(k)}$ where $P(k)$ is a polynomial.
In order for G to serve our purposes as a useful pseudo-random bit generator we will
require the following:

(1) G is efficient. The sequence $b_1, b_2, \ldots, b_{P(k)}$ is output in time polynomial in k.

(2) The output of G is unpredictable. Given the generator G and the first i
output bits $b_1, b_2, \ldots, b_i$ generated from some seed x, but not the seed x itself, it is
computationally infeasible to predict the (i+1)st bit in the sequence with a better
than 50-50 chance.

This definition was first proposed by Blum and Micali [BM]. 

Let us make the notion of "unpredictability" mentioned in condition (2) above 
more formal: 

Definition Let P be a polynomial, $S_k$ a multiset consisting of $P(k)$-long bit sequences
and $S = \bigcup_k S_k$. A polynomial-size next-bit test for S is a family of circuits $C = \{C_k^i\}$.
Each circuit $C_k^i$ has i Boolean inputs where $i < P(k)$, one Boolean output and size
polynomial in k. On input the first i bits of a sequence s randomly selected from $S_k$,
$C_k^i$ will output a bit b. Let $p^C_{k,i}$ denote the probability that b equals the (i+1)st bit of s. We
say that S passes the test C if for every polynomial Q, all sufficiently large k and all
$i < P(k)$:

$$p^C_{k,i} < \frac{1}{2} + \frac{1}{Q(k)}.$$



We will refer to both $C$ and $C_k^i$ as next-bit tests.
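
To make the quantity $p^C_{k,i}$ concrete, the following sketch measures it exactly for a small, explicitly listed multiset. It is an illustration under stated assumptions: `next_bit_success` is a hypothetical helper, an ordinary Python function stands in for the circuit $C_k^i$, and the two example multisets are invented.

```python
from itertools import product

def next_bit_success(sequences, predictor, i):
    """Fraction of sequences s in the multiset with predictor(s[:i]) == s[i].

    Here s[:i] is the first i bits and s[i] is the (i+1)st bit (0-based index i).
    """
    hits = sum(1 for s in sequences if predictor(s[:i]) == s[i])
    return hits / len(sequences)

def xor_predictor(prefix):
    # Guess that the next bit is the XOR of the first two bits.
    return prefix[0] ^ prefix[1]

# All 3-bit sequences: no predictor does better than 1/2 on the third bit.
uniform = list(product((0, 1), repeat=3))

# A badly non-random multiset: the third bit is always the XOR of the first two.
parity = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]

p_uniform = next_bit_success(uniform, xor_predictor, 2)
p_parity = next_bit_success(parity, xor_predictor, 2)
```

The second multiset fails the test because the predictor succeeds with probability 1, far above one half.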

Now we can state condition (2) of our definition of pseudo-random bit generators 
as follows: 



(2) Let $S_k$ be the multiset of sequences output by G on all k-bit seeds. Then $S_k$
passes all polynomial-size next-bit tests. (Note that $S_k$ may be a multiset since
two different seeds might cause G to output the same sequence.)

We formally define a cryptographically strong pseudo-random bit generator
following Blum and Micali [BM]:

Definition Let P be a polynomial, $I_k$ the set of all strings of length k, and $D_k \subseteq I_k$
a set of inputs (or "seeds") of length k. Let G be a deterministic algorithm which,
on input a seed $x \in D_k$, outputs a $P(k)$-long bit sequence $s_x$ in Poly(k) time. Let
$S_k = \{ s_x \mid x \in D_k \}$. The algorithm G is a cryptographically strong pseudo-random bit
generator (a P-CSB generator) if the multiset $S = \bigcup_k S_k$ passes all next-bit tests.

Every explicit CSB generator which has so far been proposed is based on an 
unproved assumption that some given problem is intractable. Blum and Micali [BM] 
were the first to show such a generator. They based it on the assumed difficulty of 
solving the discrete logarithm problem. For further details on the discrete logarithm 
problem, see [AL]. 

Definition Let p be a prime. The set of integers [1, p-1] forms a cyclic group $Z_p^*$ under
multiplication mod p. Given a prime p, a generator g for $Z_p^*$, and $y \in Z_p^*$, the Discrete
Logarithm Problem (DLP) is to find the unique $x \in Z_p^*$ such that $y = g^x \pmod{p}$. This
x is often denoted $\mathrm{index}_{p,g}(y)$.

Let $C = \{C_n\}$ be a family of circuits such that $C_n$ has 3n Boolean inputs and
size $P(n)$ for some polynomial P. Think of these inputs as consisting of an n-bit
prime p, an n-bit generator g for $Z_p^*$, and an n-bit $y \in Z_p^*$. Blum and Micali are able
to construct a CSB generator under the (unproved) assumption that for every such
family of circuits $C = \{C_n\}$ and all polynomials P and Q: for all sufficiently large
n, $C_n(p, g, y) \ne \mathrm{index}_{p,g}(y)$ for at least a fraction $1/Q(n)$ of the n-bit primes p. Some
subsequent generators have been based on the difficulty of factoring [G,GMT,Y] and
the quadratic residuosity problem [BBS].

Although the Discrete Logarithm Problem seems difficult, the inverse of the DLP
is easily solved in polynomial time. Given a prime p, a generator g for $Z_p^*$, and $x \in Z_p^*$,
the function $\mathrm{POWER}(g,x,p) = g^x \pmod{p}$ can be calculated by successive squaring
in time polynomial in the length of g, x, and p. Thus POWER can be thought of as a
"one-way" function: easy to compute but difficult to invert. Yao [Y] has subsequently
shown that given any "one-way" function, it is possible to construct a CSB generator.
Let us make this notion of "one-way" more precise.
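
As a small illustration of why POWER is easy to compute, here is a sketch of successive squaring. The function name `power` is ours. The loop scans the bits of x from least significant to most significant, so it performs a number of modular multiplications proportional to the bit length of x.

```python
def power(g, x, p):
    """Compute g**x mod p by successive squaring."""
    result = 1
    base = g % p
    while x > 0:
        if x & 1:                      # current low bit of the exponent is 1
            result = (result * base) % p
        base = (base * base) % p       # square for the next bit
        x >>= 1
    return result
```

Python's built-in three-argument `pow(g, x, p)` computes the same value and can serve as a cross-check.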

Definition (Yao [Y]) Let $I_k$ be the set of all k-bit strings, let $D_k \subseteq I_k$ and let
$f_k : D_k \to D_k$ be a sequence of permutations. We will write $D = \bigcup_k D_k$ and $f = \{f_k\}$.
Then, f is a weak one-way function if the following properties are satisfied:

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input k, selects an $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs k and $x \in D_k$, computes
$f_k(x)$.

(3) There exists a polynomial Q such that the following holds. Let $C = \{C_k\}$ be
any family of polynomial-size circuits where each $C_k$ has k inputs. Then for all
sufficiently large k:

$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{Q(k)}$ of the $x \in D_k$.

We can now state an important result of Yao's precisely. 

Theorem 1 (Yao [Y]) Given any weak one-way function f, it is possible to construct
a P-CSB generator for any polynomial P.

Note the generality of this theorem. It says that given any weak one-way function,
we can construct a generator which "stretches" a k-bit seed into a $P(k)$-bit
pseudo-random sequence, for any polynomial P. Thus we can decide beforehand, up to a
polynomial, how much "stretching" we want our generator to do.

Note also one slight difference between the difficulty assumed about inverting 
a general one-way function and the difficulty which Blum and Micali assume about 
inverting the POWER function. In the general definition, one assumes that there exists 
some polynomial Q such that any family of polynomial-size circuits will fail to invert 
a one-way function with probability at least $1/Q(k)$. Blum and Micali assume that
this holds for every polynomial Q. It is, of course, possible to weaken the Blum-Micali 
assumption to conform with the more general case. The resulting generator is somewhat 
less efficient than that in the Blum-Micali paper. 

Proof of Theorem 1 We first show the following lemma, due to Blum and Micali [BM],
which provides a set of sufficient conditions for constructing CSB generators. We then
show that, given any weak one-way function, it is possible to satisfy these conditions.

Lemma 1.1 (Blum, Micali [BM]) Let $I_k$ be the set of k-bit strings and let $D_k \subseteq I_k$.
Let $g_k : D_k \to D_k$ be a sequence of permutations and let $B_k : D_k \to \{0, 1\}$ be a sequence
of predicates. We will write $D = \bigcup_k D_k$, $g = \{g_k\}$ and $B = \{B_k\}$. If the following
set of conditions holds, then it is possible to construct a P-CSB generator, for any
polynomial P.

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input k, chooses $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs k and $x \in D_k$, computes
$g_k(x)$.

(3) There exists a polynomial-time algorithm which, on inputs k and $x \in D_k$, computes
$B_k(g_k(x))$.

(4) Let $C = \{C_k\}$ be any family of polynomial-size circuits such that each $C_k$ has k
inputs and let Q be any polynomial. Then, for all sufficiently large k:

$C_k(x) \ne B_k(x)$ for at least a fraction $\frac{1}{2} - \frac{1}{Q(k)}$ of the $x \in D_k$.

Proof of Lemma 1.1 First we construct the P-CSB generator and then prove that
its outputs must pass all next-bit tests.

Choose an appropriate value of k to be the seed length and choose in probabilistic
polynomial time a random $x \in D_k$ to be used as the seed. Set $c = P(k)$, the desired
length of the output sequence, and generate the bits:

$$B_k(g_k(x)),\ B_k(g_k^2(x)),\ \ldots,\ B_k(g_k^c(x)).$$

The notation $g_k^j$ indicates the j-fold composition of $g_k$. Now, output these bits in
reverse order, i.e.:

$$B_k(g_k^c(x)),\ B_k(g_k^{c-1}(x)),\ \ldots,\ B_k(g_k(x)).$$

It should be clear that all of this can be accomplished in polynomial time by conditions
(2) and (3).
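
The generator of Lemma 1.1 can be sketched as follows. This is a minimal illustration, not the thesis's code: `csb_generate` is a hypothetical name, and `toy_g` and `toy_B` are invented stand-ins (exponentiation of 2 modulo 11, and a threshold predicate) chosen only so the example runs. They carry no hardness guarantee whatsoever.

```python
def csb_generate(g, B, seed, length):
    """Return B(g^length(seed)), ..., B(g^2(seed)), B(g(seed)).

    The bits are computed by iterating g forward and are emitted in
    reverse order, as in the construction above.
    """
    bits = []
    y = seed
    for _ in range(length):
        y = g(y)             # y is now g^j(seed) for j = 1, 2, ...
        bits.append(B(y))
    bits.reverse()
    return bits

# Toy stand-ins (assumptions for illustration only).
P_TOY, GEN_TOY = 11, 2       # 2 generates the multiplicative group mod 11

def toy_g(x):
    # A permutation of {1, ..., 10}: x -> 2**x mod 11.
    return pow(GEN_TOY, x, P_TOY)

def toy_B(x):
    # An arbitrary predicate on {1, ..., 10}.
    return 1 if x <= (P_TOY - 1) // 2 else 0

sequence = csb_generate(toy_g, toy_B, 1, 4)
```

Because of the reversal, a shorter output is always a suffix of a longer one generated from the same seed.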

It remains to show that the sequences output by this generator pass all next-bit
tests. Suppose that this is not true. Then there exists a polynomial Q and a family of
Poly(k)-size circuits $C = \{C_k^i\}$ where each $C_k^i$ has $i < P(k)$ inputs and the following
holds. For each of infinitely many values of k there exists some i such that:

$$p^C_{k,i} \ge \frac{1}{2} + \frac{1}{Q(k)}$$

where $p^C_{k,i}$ is the probability that the circuit $C_k^i$ outputs the correct (i+1)st bit of a
sequence when given the first i bits as input.

We now construct a polynomial-size family of circuits $A = \{A_k\}$ where each $A_k$
has k inputs and such that for infinitely many values of k, $A_k$ correctly computes the
predicate $B_k(x)$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$. This contradicts condition (4).

Choose one of the infinitely many values of k such that for some $i < P(k)$:

$$p^C_{k,i} \ge \frac{1}{2} + \frac{1}{Q(k)}.$$

The circuit $A_k$ uses $C_k^i$ as a "subroutine". On input $x \in D_k$, the circuit $A_k$ first
generates the i-bit sequence:

$$B_k(g_k^i(x)),\ B_k(g_k^{i-1}(x)),\ \ldots,\ B_k(g_k(x))$$

and inputs this sequence to the next-bit test circuit $C_k^i$. Note that this can be
accomplished with a polynomial number of circuit gates by conditions (2) and (3). The
circuit $A_k$ then outputs whatever value $C_k^i$ outputs on these bits.

Note that the bits:

$$B_k(g_k^i(x)),\ B_k(g_k^{i-1}(x)),\ \ldots,\ B_k(g_k(x))$$

are the first i bits of the CSB sequence:

$$B_k(g_k^i(x)),\ \ldots,\ B_k(g_k(x)),\ B_k(x),\ \ldots,\ B_k(g_k^{i-c+1}(x)).$$






Since the (i+1)st bit of this sequence is $B_k(x)$, the circuit $A_k$ will correctly compute
$B_k(x)$ whenever the circuit $C_k^i$ correctly outputs the (i+1)st bit. Furthermore, the seed
of this sequence is $g_k^{i-c}(x)$ and since $g_k$ is, by assumption, a permutation, so is $g_k^{i-c}$.
Thus, as x ranges over $D_k$, $g_k^{i-c}(x)$ ranges over all possible seeds, which means that the
next-bit circuit $C_k^i$ will correctly output the (i+1)st bit of this particular sequence for a
fraction at least $\frac{1}{2} + \frac{1}{Q(k)}$ of the x. Thus, the circuit $A_k$ computes $B_k(x)$ for a fraction
at least $\frac{1}{2} + \frac{1}{Q(k)}$ of the $x \in D_k$, which contradicts condition (4). This completes the
proof of the lemma.

It remains to show that, given some arbitrary one-way function $f = \{f_k\}$ over an
accessible domain $E = \bigcup_k E_k$, it is possible to construct a new domain $D = \bigcup_k D_k$,
a function $g = \{g_k\}$ and a predicate $B = \{B_k\}$ which satisfy the four conditions of
Lemma 1.1. This is proven in the following lemma due to Yao.

Lemma 1.2 (Yao [Y]) Let $f = \{f_k\}$ be a weak one-way function over an accessible
domain $E = \bigcup_k E_k$. Then, by definition, for each k, given any $x \in E_k$ it is possible to
compute $f_k(x)$ in polynomial time and there exists some constant d with the following
property. Given any family of polynomial-size circuits $C = \{C_k\}$ where $C_k$ has k
inputs, we have, for all sufficiently large k:

$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{k^d}$ of the $x \in E_k$.

The following construction satisfies conditions (1)-(4) of Lemma 1.1:

(a) Set $D_k$ to be the cartesian product of ck copies of $E_k$ where c is a value which
depends on k and the constant d mentioned in the statement of this lemma. The
exact value of c will be determined later in the proof. Suffice it for now to say that
it is a polynomial function of k. Formally:

$$D_k = \{ (x_1, x_2, \ldots, x_{ck}) \mid x_1 \in E_k, \ldots, x_{ck} \in E_k \}.$$

(b) Let $g_k((x_1, x_2, \ldots, x_{ck})) = (f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck}))$ where each $x_j \in E_k$.

(c1) Let $B_k^i(x)$ = the ith bit of $f_k^{-1}(x)$ where $x \in E_k$.

(c2) Let

$$B_k((x_1, x_2, \ldots, x_{ck})) = \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(x_{c(i-1)+j})$$

where each $x_j \in E_k$ and "$\oplus$" denotes the "exclusive-or" operation.
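
The construction in (a)-(c2) can be checked on a toy instance. In the sketch below everything concrete is an assumption made for illustration: `f` is an invertible affine map on 3-bit integers (certainly not one-way), bits are numbered from the most significant end, and `B_k` is computed by brute force through `f_inv`. The point is only that $B_k(g_k(\cdot))$ can be computed without ever inverting f.

```python
import random

K, C = 3, 2                     # toy parameters: k-bit strings, c elements per group

def f(x):                       # toy permutation of {0, ..., 7}; NOT one-way
    return (5 * x + 3) % 8

def f_inv(y):                   # its inverse, since 5 * 5 = 25 = 1 (mod 8)
    return (5 * (y - 3)) % 8

def bit(x, i):
    """The ith bit of the K-bit integer x, with i = 1 the most significant."""
    return (x >> (K - i)) & 1

def g_k(xs):
    """Apply f to each coordinate, as in (b)."""
    return tuple(f(x) for x in xs)

def B_k(ys):
    """XOR over groups i and positions j of the ith bit of f_inv(y_{c(i-1)+j}), as in (c2)."""
    acc = 0
    for pos, y in enumerate(ys):
        acc ^= bit(f_inv(y), pos // C + 1)
    return acc

def B_of_g(xs):
    """B_k(g_k(xs)) computed directly from xs, with no inversion of f."""
    acc = 0
    for pos, x in enumerate(xs):
        acc ^= bit(x, pos // C + 1)
    return acc

rng = random.Random(0)
samples = [tuple(rng.randrange(8) for _ in range(C * K)) for _ in range(200)]
agree = all(B_k(g_k(xs)) == B_of_g(xs) for xs in samples)
```

The two computations agree on every sample because the ith bit of $f_k^{-1}(f_k(x))$ is just the ith bit of x.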
Proof of Lemma 1.2 It should be fairly obvious that conditions (1), (2) and (3) of
Lemma 1.1 are satisfied by this construction. The domain $D_k$ is certainly accessible
since, by assumption, $E_k$ is accessible. Given any $(x_1, x_2, \ldots, x_{ck}) \in D_k$, the function
$g_k((x_1, x_2, \ldots, x_{ck}))$ can be computed in polynomial time since, by assumption, each
$f_k(x_j)$ is computable in polynomial time and, as mentioned, c is polynomial
in k. Finally, the predicate:

$$B_k(g_k((x_1, x_2, \ldots, x_{ck}))) = B_k((f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck})))$$

$$= \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(f_k(x_{c(i-1)+j}))$$

can easily be computed in polynomial time since for any $x \in E_k$:

$B_k^i(f_k(x))$ = the ith bit of $f_k^{-1}(f_k(x))$
= the ith bit of x

The more difficult task is to prove that, as constructed, the predicate $B_k$ satisfies
condition (4) of Lemma 1.1. Let $C = \{C_k\}$ be a family of polynomial-size circuits
where each circuit $C_k$ takes inputs from the domain $D_k$ constructed in (a) above. For
every such family of circuits $\{C_k\}$ and every polynomial $Q(k)$, we will show, for all
sufficiently large k:

$$C_k((x_1, x_2, \ldots, x_{ck})) \ne B_k((x_1, x_2, \ldots, x_{ck}))$$

for at least a fraction $\frac{1}{2} - \frac{1}{Q(k)}$ of the $(x_1, x_2, \ldots, x_{ck}) \in D_k$.

Actually, we will choose to formulate this problem slightly differently. For every family
of circuits $\{C_k\}$ and every polynomial $Q(k)$, we will show, for all sufficiently large k:

$$\mathrm{Prob}[\, C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck})) \,] < \frac{1}{2} + \frac{1}{Q(k)}$$

where $(x_1, x_2, \ldots, x_{ck})$ is a randomly chosen element of $D_k$. This will be proven through
a sequence of three lemmas.

Note one small point. In order to be absolutely consistent, we should actually show
that the probability of any circuit $C_k$ correctly computing the predicate $B_k$ is less than
$\frac{1}{2} + \frac{1}{Q(ck^2)}$, since the length of each element in $D_k$ is $ck^2$ bits. However, since $Q(ck^2)$ is
polynomial in k, it suffices to prove the result as stated.

Lemma 1.2.1 Suppose that there exists a polynomial-size family of circuits $C = \{C_k\}$
where each circuit $C_k$ takes inputs from the domain $D_k$ and such that for all k and all
$(x_1, x_2, \ldots, x_{ck}) \in D_k$ we have:

$$C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck})).$$

Then, there exists a polynomial-size family of circuits $A = \{A_k\}$ where each $A_k$ has k
Boolean inputs such that for all k and all $x \in E_k$ we have:

$$A_k(x) = f_k^{-1}(x).$$

This lemma just says that if we can easily solve the predicate $B_k$ on all inputs,
then we can easily invert the assumed one-way function $f_k$ on all inputs. Although
we do not actually need the lemma in order to show our desired result, its proof will
shed some light on why the fact that $f_k$ is hard to invert implies that $B_k$ is difficult to
compute.
Proof of Lemma 1.2.1 Let us examine the predicate $B_k$ a bit more closely:

$$B_k((x_1, x_2, \ldots, x_{ck})) = \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(x_{c(i-1)+j})$$

$$= B_k^1(x_1) \oplus B_k^1(x_2) \oplus \cdots \oplus B_k^1(x_c) \qquad (1)$$

$$\oplus\ B_k^2(x_{c+1}) \oplus B_k^2(x_{c+2}) \oplus \cdots \oplus B_k^2(x_{2c}) \qquad (2)$$

$$\oplus\ B_k^3(x_{2c+1}) \oplus B_k^3(x_{2c+2}) \oplus \cdots \oplus B_k^3(x_{3c}) \qquad (3)$$

$$\vdots$$

$$\oplus\ B_k^k(x_{(k-1)c+1}) \oplus B_k^k(x_{(k-1)c+2}) \oplus \cdots \oplus B_k^k(x_{kc}). \qquad (k)$$

Note that the predicate $B_k^1$ is applied to inputs $x_1$ through $x_c$ (line (1)), $B_k^2$ is
applied to inputs $x_{c+1}$ through $x_{2c}$ (line (2)), and so on, down to line (k). Thus,
$B_k((x_1, x_2, \ldots, x_{ck}))$ is just a big exclusive-or of the first bit of $f_k^{-1}(x_j)$ for j = 1 to c,
the second bit of $f_k^{-1}(x_j)$ for j = c + 1 to 2c, and so on, down to the kth bit of $f_k^{-1}(x_j)$
for j = (k - 1)c + 1 to kc.

So, suppose we are given some $x \in E_k$ and wish to compute $f_k^{-1}(x)$. We will
calculate each bit of $f_k^{-1}(x)$ separately. To get the first bit, choose some random
elements $x_2 \in E_k, x_3 \in E_k, \ldots, x_{ck} \in E_k$ and calculate u where:

$$u = C_k((x, f_k(x_2), f_k(x_3), \ldots, f_k(x_{ck}))).$$

(Remember the circuit $C_k$ correctly computes the predicate $B_k$ on all inputs.) This is
just a big exclusive-or of $B_k^1(x)$ (i.e., the first bit of $f_k^{-1}(x)$) and a series of terms of the
form $B_k^i(f_k(x_j))$. Recall that $B_k^i(f_k(x_j))$ is just the ith bit of $x_j$, thus the exclusive-or
of all of these latter terms can be computed in polynomial time. More formally, if the
exclusive-or of all the $B_k^i(f_k(x_j))$ terms is denoted v then:

$$v = \bigoplus_{j=2}^{c} B_k^1(f_k(x_j)) \ \oplus\ \bigoplus_{i=2}^{k} \bigoplus_{j=1}^{c} B_k^i(f_k(x_{c(i-1)+j}))$$

$$= \bigoplus_{j=2}^{c} (\text{first bit of } x_j) \ \oplus\ \bigoplus_{i=2}^{k} \bigoplus_{j=1}^{c} (\text{ith bit of } x_{c(i-1)+j})$$

and v is computable in polynomial time. Thus, if we exclusive-or u with v then
everything is "cancelled out" with the exception of $B_k^1(x)$ which is just the first bit of
$f_k^{-1}(x)$ as desired. Formally:

$$u \oplus v = B_k^1(x) = \text{the first bit of } f_k^{-1}(x).$$

Since $C_k$ is, by assumption, a polynomial-size circuit and v is polynomial-time
computable, it is possible to get the first bit of $f_k^{-1}(x)$ with a polynomial number of
circuit gates.
It should be clear that the second bit of $f_k^{-1}(x)$ can be computed similarly. Suppose
we choose $x_1 \in E_k, \ldots, x_c \in E_k, x_{c+2} \in E_k, \ldots, x_{ck} \in E_k$ and compute:

$$C_k((f_k(x_1), \ldots, f_k(x_c), x, f_k(x_{c+2}), \ldots, f_k(x_{ck}))).$$

This is just a big exclusive-or of $B_k^2(x)$ (the second bit of $f_k^{-1}(x)$) and a series of terms
of the form $B_k^i(f_k(x_j))$ each of which can be calculated in polynomial time. Thus, using
the same trick as before, it is possible to "isolate out" the second bit of $f_k^{-1}(x)$ with
a polynomial number of circuit gates. The remaining bits can be computed similarly.
The following algorithm can easily be transformed into a polynomial-size circuit $A_k$
with k Boolean inputs such that for all $x \in E_k$ we have:

$$A_k(x) = f_k^{-1}(x).$$

Algorithm A1

input: $x \in E_k$

output: $y \in E_k$ such that $y = f_k^{-1}(x)$

(1) Set y = null

(2) For h = 1 to k do

(2.1) Choose $x_1, x_2, \ldots, x_{c(h-1)}, x_{c(h-1)+2}, \ldots, x_{ck}$,
all elements of $E_k$

(2.2) Set $u = C_k((f_k(x_1), \ldots, f_k(x_{c(h-1)}), x, f_k(x_{c(h-1)+2}), \ldots, f_k(x_{ck})))$

(2.3) Set $v = \bigoplus_{i=1}^{h-1} \bigoplus_{j=1}^{c} (\text{ith bit of } x_{c(i-1)+j})$
$\oplus\ \bigoplus_{j=2}^{c} (\text{hth bit of } x_{c(h-1)+j})$
$\oplus\ \bigoplus_{i=h+1}^{k} \bigoplus_{j=1}^{c} (\text{ith bit of } x_{c(i-1)+j})$

(2.4) Set $w = u \oplus v$

(2.5) Set $y = w \circ y$ ($\circ$ denotes concatenation)

(3) Output y

Note that this algorithm need not actually choose a new set of ck - 1 elements in $E_k$
each time it calculates a new bit, as specified in line (2.1). This has been done here
for clarity of exposition. In fact, when this algorithm is converted into the circuit $A_k$,
any set of ck - 1 elements in $E_k$ can simply be "built-in" to this circuit. These same
elements can be reused to get each of the k bits of $f_k^{-1}(x)$. This completes the proof
of Lemma 1.2.1. Since $f_k$ is assumed to be a one-way function, we have, in fact, just
shown:

Corollary 1.2.1 No family of polynomial-size circuits $C = \{C_k\}$ can compute the
predicate $B = \{B_k\}$ on all inputs.
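
The bit-isolation trick of Algorithm A1 can be exercised on a toy instance. The sketch below rests on several assumptions that are not part of the thesis: `f` is a small invertible affine map (not one-way), `oracle` plays the role of a circuit $C_k$ that computes $B_k$ on all inputs and is implemented by brute force, and bits are numbered from the most significant end so that recovered bits are appended rather than prepended.

```python
import random

K, C = 3, 2                          # toy parameters: k-bit strings, c per group

def f(x):                            # toy permutation of {0, ..., 7}; NOT one-way
    return (5 * x + 3) % 8

def f_inv(y):
    return (5 * (y - 3)) % 8

def bit(x, i):
    """The ith bit of the K-bit integer x, with i = 1 the most significant."""
    return (x >> (K - i)) & 1

def oracle(ys):
    """Stand-in for a circuit C_k that computes B_k correctly on every input."""
    acc = 0
    for pos, y in enumerate(ys):
        acc ^= bit(f_inv(y), pos // C + 1)
    return acc

def invert(x, rng):
    """Recover f_inv(x) using only f, the oracle, and XOR (steps 2.1 to 2.5)."""
    y = 0
    for h in range(1, K + 1):
        target = C * (h - 1)         # 0-based slot of x_{c(h-1)+1}, in group h
        xs = [rng.randrange(2 ** K) for _ in range(C * K)]    # step 2.1
        ys = [f(v) for v in xs]
        ys[target] = x               # x itself goes in the target slot
        u = oracle(tuple(ys))        # step 2.2
        v = 0
        for pos, val in enumerate(xs):                        # step 2.3
            if pos != target:
                v ^= bit(val, pos // C + 1)
        w = u ^ v                    # step 2.4: the hth bit of f_inv(x)
        y = (y << 1) | w             # step 2.5, most significant bit first
    return y

rng = random.Random(1)
recovered = [invert(f(a), rng) for a in range(8)]
```

With a perfect oracle the recovery is exact for every input, whatever random elements are drawn.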
Before going on to the next lemma, we introduce some notation. We can think of
each $(x_1, x_2, \ldots, x_{ck}) \in D_k$ as consisting of k "groups", each containing c elements. The
first "group" consists of $x_1, x_2, \ldots, x_c$, the second "group" consists of $x_{c+1}, x_{c+2}, \ldots, x_{2c}$,
and so on until the kth group: $x_{(k-1)c+1}, x_{(k-1)c+2}, \ldots, x_{kc}$. As we saw before, the value
of $B_k((x_1, x_2, \ldots, x_{ck}))$ is just a big exclusive-or of terms of the form $B_k^1(x_j)$ for each $x_j$
in the first group, $B_k^2(x_j)$ for each $x_j$ in the second group, and so on, until $B_k^k(x_j)$ for
each $x_j$ in the kth group. Thus, $B_k$ is just a big exclusive-or of the ith bit of $f_k^{-1}(x_j)$
for each $x_j$ in the ith group, where i ranges from 1 to k.

Definition Let $C = \{C_k\}$ be a family of circuits where each $C_k$ takes inputs from the
domain $D_k$. Think of the circuit $C_k$ as trying to compute the predicate $B_k$. Now, for
each $x \in E_k$ define the probability $s^C_{i,k}(x)$ as follows:

$$s^C_{i,k}(x) = \mathrm{Prob}[\, C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck})) \,]$$

where $(x_1, x_2, \ldots, x_{ck})$ is randomly drawn from the elements of $D_k$
which have x in the ith group.

In other words, suppose that $(x_1, x_2, \ldots, x_{ck})$ has x in the ith group, i.e., x equals at least
one $x_j$ for j = (i - 1)c + 1 to ic. Then, $s^C_{i,k}(x)$ is just the probability that $C_k$ will
correctly compute the predicate $B_k$ on the input $(x_1, x_2, \ldots, x_{ck})$. Intuitively, if $s^C_{i,k}(x)$
is large, then we have a good chance of being able to get the ith bit of $f_k^{-1}(x)$ by
repeatedly using the circuit $C_k$. If $s^C_{i,k}(x)$ is large for every value of i from 1 to k, then
we have a good chance of being able to get all the bits of $f_k^{-1}(x)$ by repeatedly using
the circuit $C_k$. This last statement is made precise in the following lemma.
Lemma 1.2.2 Let $C = \{C_k\}$ be a family of polynomial-size circuits where each circuit
$C_k$ takes inputs from the domain $D_k$. Suppose that there exists a polynomial $V(k)$
and a constant d such that for all sufficiently large k we have, for a randomly chosen
$x \in E_k$:

$$\mathrm{Prob}\left[\, s^C_{i,k}(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of } i \text{ from 1 to } k \,\right] \ge 1 - \frac{1}{k^d}.$$

Then, there exists a family of polynomial-size circuits $A = \{A_k\}$ where each $A_k$ has k
Boolean inputs and for all sufficiently large k:

$A_k(x) = f_k^{-1}(x)$ for at least a fraction $1 - \frac{1}{k^d}$ of the $x \in E_k$.

In other words, there exists a family of polynomial-size circuits which invert $f_k$ with
high probability if, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[\, s^C_{1,k}(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ and } s^C_{2,k}(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ and } \cdots \text{ and } s^C_{k,k}(x) \ge \frac{1}{2} + \frac{1}{V(k)} \,\right] \ge 1 - \frac{1}{k^d}.$$

We will show in the proof of this lemma that if $s^C_{i,k}(x)$ is large then we can, with a
polynomial-size circuit, correctly compute the ith bit of $f_k^{-1}(x)$. Thus, if $s^C_{i,k}(x)$ is large
for every value of i from 1 to k, then we can, with a polynomial-size circuit, compute
all of $f_k^{-1}(x)$. The lemma simply says that if, for all sufficiently large k, $s^C_{i,k}(x)$ is large
for most of the $x \in E_k$, then we can, for all sufficiently large k, invert $f_k$ on most of
the $x \in E_k$.

Let us first see how to compute the first bit of $f_k^{-1}(x)$ for some x which has the
property that $s^C_{1,k}(x) \ge \frac{1}{2} + \frac{1}{V(k)}$. On all inputs $(x_1, x_2, \ldots, x_{ck}) \in D_k$ which have x in
the first group we know that:

$$\mathrm{Prob}[\, C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck})) \,] \ge \frac{1}{2} + \frac{1}{V(k)}.$$

Given $(x_1, x_2, \ldots, x_{ck}) \in D_k$, there are c elements in the first group, namely $x_1, x_2, \ldots, x_c$.
Among these elements, let us say that $x_j$ occupies "position" j. Then, since $C_k$ correctly
computes the predicate $B_k$ with probability at least $\frac{1}{2} + \frac{1}{V(k)}$ whenever x is in the first
group, there must be some position j from 1 to c such that $C_k$ correctly computes $B_k$
with probability at least $\frac{1}{2} + \frac{1}{V(k)}$ whenever x is in position j. Since this position will
be "built-in" to the circuit which computes $f_k^{-1}(x)$, we may assume, without loss of
generality, that it is position 1.

Now, just as in Lemma 1.2.1, we begin by choosing some random elements
$x_2 \in E_k, x_3 \in E_k, \ldots, x_{ck} \in E_k$ and compute u where:

$$u = C_k((x, f_k(x_2), f_k(x_3), \ldots, f_k(x_{ck}))).$$

Also just as before, we compute v, a big exclusive-or of terms of the form $B_k^i(f_k(x_j))$:

$$v = \bigoplus_{j=2}^{c} B_k^1(f_k(x_j)) \ \oplus\ \bigoplus_{i=2}^{k} \bigoplus_{j=1}^{c} B_k^i(f_k(x_{c(i-1)+j}))$$

$$= \bigoplus_{j=2}^{c} (\text{first bit of } x_j) \ \oplus\ \bigoplus_{i=2}^{k} \bigoplus_{j=1}^{c} (\text{ith bit of } x_{c(i-1)+j})$$

Finally, we calculate $u \oplus v$. If $C_k$ correctly computes $B_k$ on all inputs (as assumed in
Lemma 1.2.1) then, as we saw earlier, $u \oplus v$ is the first bit of $f_k^{-1}(x)$. Here, however,
we know only that $C_k$ succeeds with high probability when x is in position 1. In fact,
$C_k$ may have computed the predicate $B_k$ incorrectly on the chosen input, in which
case $u \oplus v$ may well not be the correct first bit of $f_k^{-1}(x)$. However, if we recompute
$u \oplus v$ a number of times using a new set of inputs $x_2, x_3, \ldots, x_{ck}$ each time, we should
expect to get the correct first bit at least $\frac{1}{2} + \frac{1}{V(k)}$ of the time. Saying it differently, if
the average value of $u \oplus v$ is at least $\frac{1}{2} + \frac{1}{V(k)}$ then the correct first bit is probably 1,
whereas if this average value is at most $\frac{1}{2} - \frac{1}{V(k)}$ then the correct bit is probably 0.

This method of repeated recalculation using different inputs is known as "sampling"
and each recalculation is called a "trial". Clearly, the more trials we perform, the
more certain we will be about which bit to choose. Let $\mathit{avg}$ be the average value of
$u \oplus v$ computed during sampling to determine the first bit of $f_k^{-1}(x)$. Let $\mathit{avg}_{true}$
be the value of $\mathit{avg}$ which we would obtain if we performed a trial for every possible
distinct set of ck - 1 values $x_2, x_3, \ldots, x_{ck}$ (i.e., $\mathit{avg}_{true}$ is the true value which $\mathit{avg}$
is estimating). Now, suppose that we perform enough trials to be certain that $\mathit{avg}$ is
within $1/(2V(k))$ of $\mathit{avg}_{true}$. Then, if the first bit is actually 1, we must have:

$$\mathit{avg} \ge \frac{1}{2} + \frac{1}{V(k)} - \frac{1}{2V(k)}$$

so

$$\mathit{avg} \ge \frac{1}{2} + \frac{1}{2V(k)}.$$

Alternatively, if the first bit is actually 0, we must have:

$$1 - \mathit{avg} \ge \frac{1}{2} + \frac{1}{V(k)} - \frac{1}{2V(k)}$$

so

$$\mathit{avg} \le \frac{1}{2} - \frac{1}{2V(k)}.$$

Thus, if we perform enough trials to be certain that $\mathit{avg}$ is within $1/(2V(k))$ of $\mathit{avg}_{true}$
then we can, with certainty, choose the correct first bit of $f_k^{-1}(x)$. We show below that
this can be achieved with only a polynomial number of trials.
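
The sampling idea can be illustrated with a simulated noisy source. In this sketch `make_trial` is a hypothetical stand-in for one evaluation of u XOR v: it returns the true bit with a fixed success probability and the wrong bit otherwise. The success probability 0.7 and the trial count 2001 are arbitrary illustrative choices, not values taken from the thesis.

```python
import random

def make_trial(true_bit, success_prob, rng):
    """Simulate one trial: correct with probability success_prob, else flipped."""
    def trial():
        correct = rng.random() < success_prob
        return true_bit if correct else 1 - true_bit
    return trial

def estimate_bit(trial, trials):
    """Average the trial outcomes and round the average to the nearer bit."""
    avg = sum(trial() for _ in range(trials)) / trials
    return 1 if avg >= 0.5 else 0

rng = random.Random(0)
guess_one = estimate_bit(make_trial(1, 0.7, rng), 2001)
guess_zero = estimate_bit(make_trial(0, 0.7, rng), 2001)
```

With an advantage this large the average sits many standard deviations away from one half, so the rounded estimate recovers the hidden bit in practice.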

It should be clear that, for any i, if $s^C_{i,k}(x) \ge \frac{1}{2} + \frac{1}{V(k)}$ then we can, with certainty,
compute the ith bit of $f_k^{-1}(x)$ using the same method. Thus, for each $x \in E_k$ such that
$s^C_{i,k}(x) \ge \frac{1}{2} + \frac{1}{V(k)}$ for every i from 1 to k, we can compute $f_k^{-1}(x)$ with a polynomial-size
circuit. Since the hypothesis of this lemma assumes that, for infinitely many values of
k, this is true for at least $1 - \frac{1}{k^d}$ of the $x \in E_k$, we can construct a family of polynomial
size circuits which, for infinitely many values of k, compute $f_k^{-1}(x)$ for at least $1 - \frac{1}{k^d}$
of the $x \in E_k$.

It remains only to prove that we can calculate avg to within $1/(2V(k))$ of avgtrue
with only a polynomial number of trials. If we perform $t$ trials of the sampling procedure
described above to compute avg then, by the Central Limit Theorem (see [HPS]) we
have:

$$\mathrm{Prob}(\,|\mathit{avg} - \mathit{avgtrue}| > \epsilon\,) \approx 2(1 - \Phi(\delta))$$

where

$$\delta = 2\epsilon\sqrt{t}$$

and

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp(-u^2/2)\, du.$$

Note that we can never be certain our estimate is within the bound $\epsilon$ but only sure
with arbitrarily high probability. If we wish to know how many trials are necessary in
order to be sure, except with probability $e$, that avg is within a bound $\epsilon$ of avgtrue, we must
solve the following inequality for $t$:

$$2(1 - \Phi(2\epsilon\sqrt{t})) < e. \qquad (1)$$

In fact, when our procedure is converted into a circuit, we will be able to "build-in"
a particular set of $t$ sets of inputs which can be used in sampling to assure that our
estimate avg is off by at most $1/(2V(k))$. By calculations which can be found in the
Appendix, (1) above is satisfied if $t$ is chosen such that:

$$t > \frac{\ln 2 - \ln e}{2\epsilon^2}.$$

Here, we want $\epsilon = 1/(2V(k))$ and for reasons which will be made clear below, we set
$e = 1/2^{2k}$ giving:

$$t > (\ln 2 - \ln(2^{-2k}))\,(2(V(k))^2).$$

This is certainly satisfied by:

$$t \geq 6k(V(k))^2.$$
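
The arithmetic behind this step can be checked numerically. The sketch below is only an illustration: it assumes the trial bound has the form $(\ln 2 - \ln e)/(2\epsilon^2)$ (the form that matches the substitution carried out later in the proof of Theorem 2), and the function names `trials_needed` and `trials_for_inversion` are hypothetical, not part of the original text.

```python
import math


def trials_needed(eps, ln_e):
    """Smallest integer t with t >= (ln 2 - ln e) / (2 * eps**2).

    eps  : accuracy bound on |avg - avgtrue|
    ln_e : natural logarithm of the allowed failure probability e, passed
           as a logarithm so that tiny values such as 2**(-2k) cannot underflow
    """
    return math.ceil((math.log(2) - ln_e) / (2 * eps ** 2))


def trials_for_inversion(k, V_k):
    """Trial count for eps = 1/(2 V(k)) and e = 2**(-2k) (assumed setting)."""
    eps = 1 / (2 * V_k)
    ln_e = -2 * k * math.log(2)
    return trials_needed(eps, ln_e)
```

Under these assumptions the count grows like $k(V(k))^2$, so it stays polynomial in $k$ whenever $V$ is a polynomial.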

We now show that, since $t$ is polynomial in $k$, we can build a $Poly(k)$ size circuit
which correctly computes $f_k^{-1}(x)$ for every $x \in E_k$ such that $s^C_{i,k}(x) \geq \frac{1}{2} + \frac{1}{V(k)}$. The
argument is probabilistic. Suppose that $t = 6k(V(k))^2$ and let

$$W = \{\mathit{input}_1, \mathit{input}_2, \ldots, \mathit{input}_t\}$$

be a set of $t$ randomly chosen inputs where each $\mathit{input}_i$ is a set of $ck - 1$ elements of
$E_k$ which will be used for sampling, i.e., for each $i$:

$$\mathit{input}_i = \{x_1, x_2, \ldots, x_{ck-1}\}, \quad \text{each } x_j \in E_k.$$

For a randomly chosen string $x \in E_k$ such that $s^C_{i,k}(x) \geq \frac{1}{2} + \frac{1}{V(k)}$ for every value of
$i$ from 1 to $k$, let us say that the set $W$ fails on $x$ if, for at least one value of $i$, the
value of avg computed while trying to get the $i$th bit of $f_k^{-1}(x)$ differs from avgtrue
by more than $1/(2V(k))$. Since the chance of failing on any given bit of $f_k^{-1}(x)$ is less
than $1/2^{2k}$ (because $e = 1/2^{2k}$ above) and there are $k$ bits we get:

$$\mathrm{Prob}[\,W \text{ fails on } x\,] < \frac{k}{2^{2k}}.$$

Since there are at most $2^k$ elements in $E_k$:

$$\mathrm{Prob}[\,W \text{ fails on at least one } x \in E_k\,] < \frac{k 2^k}{2^{2k}}.$$

Thus:

$$\mathrm{Prob}[\,W \text{ does not fail on any } x \in E_k\,] > 1 - \frac{k 2^k}{2^{2k}} > 0.$$
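
The counting step can be checked with exact arithmetic. This is a small sketch of the union bound only; the name `failure_bound` is illustrative and not from the original text.

```python
from fractions import Fraction


def failure_bound(k):
    """Upper bound k * 2**k / 2**(2k) on Prob[W fails on at least one x in E_k]."""
    return Fraction(k * 2 ** k, 2 ** (2 * k))
```

The bound simplifies to $k/2^k$, which is below 1 for every $k \geq 1$; that is all the existence argument requires.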
So, there must exist some set $W$ of $t$ inputs which can be used in sampling to assure
that for every $x \in E_k$ such that $s^C_{i,k}(x) \geq \frac{1}{2} + \frac{1}{V(k)}$, the value of $f_k^{-1}(x)$ can be correctly
computed with certainty. This set $W$ can simply be "built-in" to the circuit $A_k$ which
inverts $f_k$. The entire algorithm for computing $f_k^{-1}(x)$ follows below.

Let $x$ be such that $s^C_{i,k}(x) \geq \frac{1}{2} + \frac{1}{V(k)}$. Then, by definition, the circuit $C_k$
correctly computes the predicate $B_k$ with probability at least $\frac{1}{2} + \frac{1}{V(k)}$ on all inputs
$(x_1, x_2, \ldots, x_{ck}) \in D_k$ which have $x$ in the $i$th group. As before, given such an input,
we say that element $x_j$ occupies "position" $j$ in the input. Since the $i$th group consists
of elements $x_{c(i-1)+1}$ through $x_{ci}$, there must exist some position $j$ from $c(i - 1) + 1$
to $ic$ such that $C_k$ computes $B_k$ with probability at least $\frac{1}{2} + \frac{1}{V(k)}$ whenever $x$ is in
position $j$. This position will, for each $i$, be "built-in" to the circuit $A_k$ which inverts
$f_k$. We have assumed in the following algorithm, without loss of generality, that it is
position $c(i - 1) + 1$, i.e., the first position of each group.
Algorithm A2

input: $x \in E_k$

output: $y \in E_k$ such that $y = f_k^{-1}(x)$ with probability $1 - \frac{1}{k^d}$

(1) Set $y$ = null

(2) For $h = 1$ to $k$ do

    (2.1) Set count = 0

    (2.2) Set $t = 6k(V(k))^2$

    (2.3) For sample = 1 to $t$ do

        (2.3.1) Choose next "built-in" input for sampling:
                $x_1, x_2, \ldots, x_{c(h-1)}, x_{c(h-1)+2}, \ldots, x_{ck}$

        (2.3.2) Set $u = C_k(\,(f_k(x_1), \ldots, f_k(x_{c(h-1)}), x, f_k(x_{c(h-1)+2}), \ldots, f_k(x_{ck}))\,)$

        (2.3.3) Set $v = \bigoplus_{i=1}^{h-1} \bigoplus_{j=1}^{c} (i\text{th bit of } x_{c(i-1)+j})$
                $\oplus \bigoplus_{j=2}^{c} (h\text{th bit of } x_{c(h-1)+j})$
                $\oplus \bigoplus_{i=h+1}^{k} \bigoplus_{j=1}^{c} (i\text{th bit of } x_{c(i-1)+j})$

        (2.3.4) Set count = count + $(u \oplus v)$

    (2.4) Set avg = count / $t$

    (2.5) If avg $> \frac{1}{2} + \frac{1}{2V(k)}$ then set $w = 1$

    (2.6) Elseif avg $< \frac{1}{2} - \frac{1}{2V(k)}$ then set $w = 0$

    (2.7) Set $y = w \circ y$ ($\circ$ denotes concatenation)

(3) Output $y$

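
The steps above can be sketched in Python on a toy instance. Everything specific here is an assumption made for illustration: the form of $B_k$ is read off from step (2.3.3), the toy permutation `rotate_left` is trivially invertible (so it is in no sense one-way), `perfect_circuit` stands in for $C_k$ by cheating, fresh random samples replace the "built-in" set $W$, and bits are appended in the order they are computed.

```python
import random


def bit(x, i, k):
    """i-th bit of the k-bit integer x, counting from 1 at the most significant end."""
    return (x >> (k - i)) & 1


def predicate_B(xs, k, c):
    """Assumed form of B_k: XOR over groups i = 1..k of the i-th bit of each
    of the c elements in group i."""
    acc = 0
    for i in range(1, k + 1):
        for j in range(c):
            acc ^= bit(xs[c * (i - 1) + j], i, k)
    return acc


def invert_by_sampling(x, k, c, f, circuit, trials, V_k, rng):
    """Recover f^{-1}(x) one bit at a time, following steps (1)-(3) of A2."""
    y = 0
    for h in range(1, k + 1):
        pos = c * (h - 1)              # first position of group h (0-based)
        count = 0
        for _ in range(trials):
            xs = [rng.randrange(2 ** k) for _ in range(c * k)]
            images = [f(z) for z in xs]
            images[pos] = x            # x takes the place of f(x_pos)
            u = circuit(images)
            # v: the predicate over every sampled element except position pos
            v = predicate_B(xs, k, c) ^ bit(xs[pos], h, k)
            count += u ^ v
        avg = count / trials
        if avg > 0.5 + 1 / (2 * V_k):
            w = 1
        elif avg < 0.5 - 1 / (2 * V_k):
            w = 0
        else:
            raise ValueError("sample average too close to 1/2 to decide")
        y = (y << 1) | w               # append bit h to the answer
    return y


# Toy instance (hypothetical): f rotates a K-bit string left by one place.
K, C = 4, 2
MASK = 2 ** K - 1


def rotate_left(z):
    return ((z << 1) | (z >> (K - 1))) & MASK


def rotate_right(z):
    return ((z >> 1) | ((z & 1) << (K - 1))) & MASK


def perfect_circuit(images):
    """Stand-in for C_k: undoes the toy f and evaluates the predicate exactly."""
    return predicate_B([rotate_right(w) for w in images], K, C)
```

Because the stand-in circuit is always right, every trial yields the same value of $u \oplus v$ and the average is exactly 0 or 1; with a circuit that is correct only with probability $\frac{1}{2} + \frac{1}{V(k)}$, the thresholds in steps (2.5) and (2.6) are what separate the two cases.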
It should be clear that this algorithm can easily be converted into a polynomial-size
circuit $A_k$ with $k$ Boolean inputs. This completes the proof of Lemma 1.2.2. Note that
$f_k$ is assumed to be a one-way function which means that, for some constant $d$, no
family of polynomial-size circuits can invert it with probability at least $1 - \frac{1}{k^d}$. Thus
we have actually shown:

Corollary 1.2.2 Let $f = \{f_k\}$ be a polynomial-time computable function which
takes inputs from the domain $E_k$ and suppose that there exists a constant $d$ such
that any family of polynomial-size circuits which tries to invert $f_k$ is, for infinitely
many $k$, correct for less than $1 - \frac{1}{k^d}$ of the $x \in E_k$. Let $C = \{C_k\}$ be any family of
polynomial-size circuits where each circuit $C_k$ takes inputs from the domain $D_k$ (i.e.,
each circuit $C_k$ is trying to compute the predicate $B_k$). Let $V(k)$ be any polynomial.
Then, for all sufficiently large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[\, s^C_{i,k}(x) \geq \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of } i \text{ from 1 to } k \,\right] < 1 - \frac{1}{k^d}.$$



The following lemma completes the proof of Lemma 1.2 and therefore also completes 
the proof of Theorem 1. 

Lemma 1.2.3 Let $I_k$ be the set of all $k$-bit strings and let $f = \{f_k\}$ be a weak one-way
function over a domain $E_k \subseteq I_k$. Thus, there exists a constant $d$ such that any family
of polynomial-size circuits which tries to invert $f_k$ is, for infinitely many $k$, correct for
less than $1 - \frac{1}{k^d}$ of the $x \in E_k$. Let $D_k$ be a new domain consisting of the cartesian
product of $k^{2d}$ copies of $E_k$, i.e.:

$$D_k = \{\, (x_1, x_2, \ldots, x_{k^{2d}}) \mid x_1 \in E_k, \ldots, x_{k^{2d}} \in E_k \,\}$$

and let $B_k$ be the predicate over the domain $D_k$ which is described in Lemma 1.2.
Note that we are finally giving the value of the constant $c$ which has been used
until now to specify inputs in $D_k$; $c$ is equal to $k^{2d-1}$. Now, let $C = \{C_k\}$ be any
family of polynomial-size circuits where each circuit $C_k$ takes inputs from the domain
$D_k$ and tries to compute the predicate $B_k$. Let $Q(k)$ be any polynomial. Then, for all
sufficiently large $k$ we have, for a randomly chosen $(x_1, x_2, \ldots, x_{k^{2d}}) \in D_k$:

$$\mathrm{Prob}[\, C_k((x_1, x_2, \ldots, x_{k^{2d}})) = B_k((x_1, x_2, \ldots, x_{k^{2d}})) \,] < \frac{1}{2} + \frac{1}{Q(k)}.$$



Proof Let $V(k)$ be any polynomial. Then, from Corollary 1.2.2, for all sufficiently
large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[\, s^C_{i,k}(x) < \frac{1}{2} + \frac{1}{V(k)} \text{ for some value of } i \text{ from 1 to } k \,\right] > \frac{1}{k^d}.$$

Then, as shown by Goldwasser [G], there exists a polynomial $Q(k) = aV(k)$ for some
constant $a$ such that, for a random $(x_1, x_2, \ldots, x_{k^{2d}}) \in D_k$:

$$\mathrm{Prob}[\, C_k((x_1, x_2, \ldots, x_{k^{2d}})) = B_k((x_1, x_2, \ldots, x_{k^{2d}})) \,] < \frac{1}{2} + \frac{1}{Q(k)}.$$

This is a somewhat complicated counting argument so the proof is omitted here.
Details can be found in [G].



2.2 Statistical Tests for Strings 

It turns out that in order to be useful for simulating general probabilistic
computation, we would like the output from a "good" pseudo-random bit generator
to be unpredictable in a slightly different way from that discussed so far. Until now,
we have required that, given any prefix of a sequence output by our generator, it
be computationally impossible to consistently predict the next bit in the sequence.
Consider a somewhat different notion of unpredictability: Given some $n$-bit sequence
output by the generator and some truly random $n$-bit sequence, it is computationally
infeasible to distinguish the two sequences with a better than 50-50 chance. Note that
this kind of unpredictability involves the impossibility of distinguishing between two
sequences whereas the next-bit test involves the impossibility of predicting a given bit.
The following definition formalizes this:

Definition Let $P$ be a polynomial, $S_k$ a multiset consisting of $P(k)$-long bit sequences
and $S = \bigcup_k S_k$. A polynomial-size statistical test for strings is a family of circuits
$C = \{C_k\}$. Each circuit $C_k$ has $P(k)$ Boolean inputs, one Boolean output and size
polynomial in $P(k)$. The multiset $S$ passes the test $C$ if for every polynomial $Q$, and
all sufficiently large $k$:

$$|p_k^S - p_k^R| < \frac{1}{Q(k)}$$

where $p_k^S$ denotes the probability that $C_k$ outputs 1 on a randomly selected $s \in S_k$
and $p_k^R$ denotes the probability that $C_k$ outputs 1 on a randomly selected $P(k)$-long
bit sequence. We will refer to both $C$ and $C_k$ as statistical tests.

It turns out, as proven in the next theorem of Yao's, that these two notions of
unpredictability are equivalent. Thus, the output of any CSB generator passes all
polynomial-size statistical tests.

Theorem 2 (Yao [Y]) Let $P$ be a polynomial, $S_k$ a multiset consisting of $P(k)$-long
bit sequences and $S = \bigcup_k S_k$. The set $S$ passes all polynomial-size next-bit tests if and
only if it passes all polynomial-size statistical tests.

Proof The easy half is to show that if $S$ passes all polynomial-size statistical tests then
it passes all polynomial-size next-bit tests. Suppose $S = \bigcup_k S_k$ fails a polynomial-size
next-bit test. Then, by definition, there is a family of $Poly(k)$ size circuits $C = \{C_k^i\}$
such that each $C_k^i$ has $i < P(k)$ Boolean inputs and one Boolean output. Furthermore,
there exists a polynomial $Q$ such that for each of infinitely many values of $k$ there
exists some $i$ such that:

$$p^C_{k,i} \geq \frac{1}{2} + \frac{1}{Q(k)}.$$

Recall that $p^C_{k,i}$ is the probability that the circuit $C_k^i$ outputs the correct $i + 1$st bit of
a sequence $s \in S_k$ when given the first $i$ bits of $s$ as input.

We now show that $S$ must fail a statistical test $A = \{A_k\}$ where each circuit $A_k$
has $P(k)$ inputs and size $Poly(P(k))$. This statistical test simply uses the next-bit test
as a subroutine. Suppose $k$ is chosen such that the next-bit test succeeds with high
probability on strings in $S_k$. Then, there must be a value of $i$ such that the circuit $C_k^i$
predicts the $i + 1$st bit of strings in $S_k$ with high probability. The statistical test circuit
$A_k$ operates as follows. It inputs the first $i$ bits of its own input to the next-bit test
circuit $C_k^i$. It then compares the bit predicted by $C_k^i$ with the true $i + 1$st bit of its
own input. If the circuit $C_k^i$ predicted correctly, then the statistical test $A_k$ outputs a
1, otherwise it outputs a 0. Now, the next-bit test $C_k^i$ succeeds with high probability
given the first $i$ bits of a string in $S_k$ but can only succeed at most half the time on truly
random $i$-bit strings. Thus, the statistical test $A_k$ will effectively distinguish between
sequences from $S_k$ and truly random $P(k)$-bit sequences.
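
The construction of $A_k$ from a next-bit predictor can be sketched as follows. The toy multiset (strings whose last bit is the parity of the others), the predictor `parity`, and the helper names are all hypothetical; acceptance probabilities are computed by exhaustive enumeration rather than by circuits.

```python
from itertools import product


def make_statistical_test(predictor, i):
    """Build A_k: output 1 iff the predictor, given the first i bits,
    guesses the (i+1)st bit of the input correctly."""
    def test(bits):
        return 1 if predictor(bits[:i]) == bits[i] else 0
    return test


def acceptance_probability(test, strings):
    """Fraction of the given strings on which the test outputs 1."""
    strings = list(strings)
    return sum(test(s) for s in strings) / len(strings)


def parity(bits):
    """Toy next-bit predictor: the parity of the bits seen so far."""
    return sum(bits) % 2


N = 5
S = [p + (parity(p),) for p in product((0, 1), repeat=N - 1)]  # toy "generator output"
R = list(product((0, 1), repeat=N))                            # all N-bit strings

A = make_statistical_test(parity, N - 1)
p_S = acceptance_probability(A, S)
p_R = acceptance_probability(A, R)
```

In this toy case the gap between `p_S` and `p_R` is as large as it can be, which is the situation the proof describes: a predictor that succeeds on $S_k$ but only half the time on random strings yields a distinguishing test.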

More formally, let $p_k^S$ denote the probability that $A_k$ outputs 1 on a randomly
selected $s \in S_k$ and $p_k^R$ denote the probability that $A_k$ outputs 1 on a randomly selected
$P(k)$-long bit sequence. Then, by the argument above, we get for infinitely many pairs
$k$ and $i$:

$$p_k^S = p^C_{k,i}$$

and

$$p^C_{k,i} \geq \frac{1}{2} + \frac{1}{Q(k)}$$

but

$$p_k^R = \frac{1}{2}$$

so

$$p_k^S - p_k^R \geq \frac{1}{Q(k)}.$$

Thus, $S$ fails the statistical test $A = \{A_k\}$ and this half of the theorem is proven.

The more difficult task is proving that if $S = \bigcup_k S_k$ passes all next-bit tests then it
passes all polynomial-size statistical tests. The idea is as follows. Suppose $S$ fails some
statistical test $C = \{C_k\}$. We want to construct a next-bit test for $S$. For infinitely
many values of $k$, the circuit $C_k$ can distinguish between elements of $S_k$ and random
$P(k)$-bit strings. The first obvious question is: Which of these $P(k)$ bits should our
next-bit test attempt to predict? It turns out that it is possible to find a value of $i$
which has the following interesting property: If the circuit $C_k$ is given as input the first
$i$ bits of some string $s \in S_k$ followed by a random sequence of $P(k) - i$ bits, it can
detect with a certain probability whether the $i + 1$st bit of its input is the correct $i + 1$st
bit of $s$. By repeating this experiment a number of times with different sequences of
$P(k) - i$ bits, it is possible to ascertain the $i + 1$st bit of $s$ with a better than 50 percent
chance. A circuit $A_k^i$ which has the statistical test $C_k$ "built in" can be constructed
to repeat this experiment the appropriate number of times and thereby constitute a
next-bit test.

Formally, suppose that a multiset $S = \bigcup_k S_k$ fails some polynomial-size statistical
test. Then, by definition, there is a family of polynomial-size circuits $C = \{C_k\}$ such
that $C_k$ has $P(k)$ inputs and one output. Furthermore, there exists a polynomial $Q$
such that for infinitely many values of $k$:

$$|p_k^S - p_k^R| \geq \frac{1}{Q(k)}$$

where $p_k^S$ is the probability that $C_k$ outputs 1 on a randomly selected sequence in
$S_k$ and $p_k^R$ is the probability that $C_k$ outputs 1 on a randomly selected $P(k)$-long bit
sequence. We now show that $S$ must necessarily fail a next-bit test.
Let $[x|i]y$ denote the concatenation of the first $i$ bits of $x$ with $y$. For each $i \leq P(k)$,
let $p_k^i$ denote the probability that $C_k([x|i]y) = 1$ where $x$ is chosen randomly from $S_k$
and $y$ is a random string of $P(k) - i$ bits. Thus, $p_k^i$ is the probability that the statistical
test $C_k$ outputs 1 when given as input the first $i$ bits of some string in $S_k$ followed by
$P(k) - i$ random bits. Note that $p_k^0 = p_k^R$ and $p_k^{P(k)} = p_k^S$.

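We now construct a family of circuits $A = \{A_k^i\}$ which constitute a next-bit test
for $S$. Each $A_k^i$ has $i < P(k)$ inputs and size polynomial in $P(k)$. Let $k$ be chosen such
that:

$$|p_k^S - p_k^R| \geq \frac{1}{Q(k)}.$$

Assume, without loss of generality, that

$$p_k^S - p_k^R \geq \frac{1}{Q(k)}.$$

Then, clearly, there must be a value of $i$ such that

$$p_k^{i+1} - p_k^i \geq \frac{1}{Q(k)} \cdot \frac{1}{P(k)}.$$

(Remember, $i$ varies between 0 and $P(k)$, and the differences $p_k^{i+1} - p_k^i$ sum to
$p_k^S - p_k^R$.) The circuit $A_k^i$ will correctly predict this
$i + 1$st bit of any sequence in $S_k$ with high probability.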

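Recall that $p_k^{i+1}$ is the probability that $C_k$ outputs 1 on any string consisting of
an $(i + 1)$-bit initial segment of a string in $S_k$ followed by $P(k) - (i + 1)$ random bits.
Let $q_k^{i+1}$ be the probability that $C_k$ outputs 1 when given the first $i$ bits of a string in
$S_k$ followed by the incorrect $i + 1$st bit, followed by $P(k) - (i + 1)$ random bits. Note
that:

$$p_k^i = \frac{1}{2} p_k^{i+1} + \frac{1}{2} q_k^{i+1}$$

thus:

$$p_k^{i+1} - p_k^i \geq \frac{1}{Q(k)} \cdot \frac{1}{P(k)}$$

$$p_k^{i+1} - \left( \frac{1}{2} p_k^{i+1} + \frac{1}{2} q_k^{i+1} \right) \geq \frac{1}{Q(k)} \cdot \frac{1}{P(k)}$$

$$\frac{1}{2} p_k^{i+1} - \frac{1}{2} q_k^{i+1} \geq \frac{1}{Q(k)} \cdot \frac{1}{P(k)}$$

$$p_k^{i+1} - q_k^{i+1} \geq \frac{2}{Q(k)P(k)} \qquad (1)$$

so:

$$p_k^{i+1} > q_k^{i+1}.$$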
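
The hybrid probabilities $p_k^i$ and the identity $p_k^i = \frac{1}{2} p_k^{i+1} + \frac{1}{2} q_k^{i+1}$ can be checked by exhaustive enumeration on a small example. The toy multiset and test below are hypothetical, chosen only so that the whole advantage sits at the last position; `hybrid` and its `flip_last` flag are illustrative names.

```python
from fractions import Fraction
from itertools import product


def hybrid(test, S, n, i, flip_last=False):
    """Probability that `test` outputs 1 on the first i bits of a uniformly
    chosen s in S followed by n - i uniform bits.  With flip_last=True the
    i-th bit of the prefix is complemented (the "incorrect bit" case)."""
    tails = list(product((0, 1), repeat=n - i))
    hits = 0
    for s in S:
        prefix = list(s[:i])
        if flip_last:
            prefix[-1] ^= 1
        for tail in tails:
            hits += test(tuple(prefix) + tail)
    return Fraction(hits, len(S) * len(tails))


# Toy multiset (hypothetical): the last bit is the parity of the others.
N = 4
S = [p + (sum(p) % 2,) for p in product((0, 1), repeat=N - 1)]


def parity_test(bits):
    """Toy statistical test: 1 iff the last bit equals the parity of the rest."""
    return 1 if bits[-1] == sum(bits[:-1]) % 2 else 0


# p[i] plays the role of p_k^i, q[i] the role of q_k^i (defined for i >= 1).
p = [hybrid(parity_test, S, N, i) for i in range(N + 1)]
q = [None] + [hybrid(parity_test, S, N, i, flip_last=True) for i in range(1, N + 1)]
```

Here `p[0]` is the acceptance probability on random strings and `p[N]` the one on strings from the toy multiset, and at least one consecutive difference must carry a $1/P(k)$ share of the total gap, as the averaging argument claims.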



Consider in particular line (1). The fact that the difference between $p_k^{i+1}$ and $q_k^{i+1}$ is
large means that if $C_k$ is given as input the first $i$ bits of some string $s \in S_k$ followed
by $P(k) - i$ random bits, it can effectively distinguish between those strings containing
the correct $i + 1$st bit of $s$ in their $i + 1$st position, and those containing the incorrect
$i + 1$st bit.

Since $p_k^{i+1} + q_k^{i+1} = 1$ and $p_k^{i+1} > q_k^{i+1}$, if we can be sure of correctly predicting
the $i + 1$st bit of a randomly drawn string in $S_k$ with probability $p_k^{i+1}$, then we will be
correct more than half the time. The next-bit test $A_k^i$ is constructed to do this.

So, suppose $A_k^i$ is given as input $b_1, b_2, \ldots, b_i$, the first $i$ bits of a randomly chosen
sequence $s \in S_k$. Let $b$ be the correct $i + 1$st bit of $s$ and let $\bar{b}$ be the incorrect $i + 1$st
bit. The circuit $A_k^i$ first calculates the probabilities:

$$r_c = \mathrm{Prob}[\, C_k(b_1, \ldots, b_i, b, b_{i+2}, \ldots, b_{P(k)}) = 1 \,]$$

and

$$r_i = \mathrm{Prob}[\, C_k(b_1, \ldots, b_i, \bar{b}, b_{i+2}, \ldots, b_{P(k)}) = 1 \,]$$

where $b_{i+2}, \ldots, b_{P(k)}$ are chosen randomly. In fact, these probabilities can only be
estimated with high accuracy but for the sake of clarity, let us assume for a moment
that they can be calculated exactly. A discussion on estimating the probabilities
appears later in the proof. Note that $A_k^i$ does not actually know which is the correct
and which the incorrect $i + 1$st bit. It simply calculates the two probabilities; one of
them will be $r_c$, the other $r_i$. The predicting circuit $A_k^i$ now chooses $b_{i+1} = b$ with
probability $\frac{r_c}{r_c + r_i}$ and chooses $b_{i+1} = \bar{b}$ with probability $\frac{r_i}{r_c + r_i}$.

Suppose we calculate the value of $r_c$ for each $i$-bit prefix of a string in $S_k$ and take
the average of all these probabilities. This average is just the total probability with
which the circuit $A_k^i$ correctly predicts the $i + 1$st bit of a randomly drawn string in $S_k$.
This average is also equal to $p_k^{i+1}$ so $A_k^i$ predicts correctly with probability $p_k^{i+1} > \frac{1}{2}$.
Thus there must exist some polynomial $R$ such that $A_k^i$ correctly chooses the $i + 1$st
bit for sequences in $S_k$ with probability at least $\frac{1}{2} + \frac{1}{R(k)}$. The set $S$ therefore fails
the next-bit test $A = \{A_k^i\}$ and this part of the proof is finished. A discussion on
estimating the probabilities $r_c$ and $r_i$ follows.

Estimating the probabilities 

Recall the probabilities $r_c$ and $r_i$ which are used to predict the $i + 1$st bit:

$$r_c = \mathrm{Prob}[\, C_k(b_1, \ldots, b_i, b, b_{i+2}, \ldots, b_{P(k)}) = 1 \,]$$

and

$$r_i = \mathrm{Prob}[\, C_k(b_1, \ldots, b_i, \bar{b}, b_{i+2}, \ldots, b_{P(k)}) = 1 \,]$$

where $b$ is the correct $i + 1$st bit and $\bar{b}$ is the incorrect $i + 1$st bit. As mentioned above,
these probabilities cannot be calculated exactly. Here we show that they can, however,
be estimated closely enough so that the next-bit test will still succeed with more than
a 50-50 chance.
The idea is as follows. Suppose the next-bit test circuit is given as input the bits
$b_1, \ldots, b_i$, an $i$-bit prefix of some string in $S_k$. In order to estimate the probability $r_c$
the next-bit circuit first chooses a sequence of random bits $b_{i+2}, \ldots, b_{P(k)}$ and then uses
the statistical test $C_k$ as a "subroutine" to calculate whether

$$C_k(\, b_1, b_2, \ldots, b_i, b, b_{i+2}, \ldots, b_{P(k)} \,)$$

causes a 0 or a 1 to be output. The next-bit circuit then repeats this same procedure a
number of times using different random sequences $b_{i+2}, \ldots, b_{P(k)}$. Finally, it takes the
average of all the 0,1 values output by the circuit $C_k$ and uses this as the estimate for
$r_c$. The estimate for $r_i$ is computed similarly. As we saw in Lemma 1.2.2, this method
is known as "sampling" and each repetition of the procedure is a "trial". Note that
the next-bit circuit does not actually know whether $b = 0$ or $b = 1$. It simply does
this "sampling" first with $b = 1$ and then with $b = 0$. One of the results will be an
estimate for $r_c$, the other an estimate for $r_i$.

Clearly, the more trials we make the closer the resulting estimates will be to $r_c$
and $r_i$. The question is, for each input of $i$ bits, how far from $r_c$ and $r_i$ can these
estimates be and still assure that the next-bit circuit will correctly predict the $i + 1$st
bit more than fifty percent of the time?

Recall that $p_k^{i+1}$ is the probability that the statistical test $C_k$ outputs 1 when given
as input the first $i + 1$ bits of a sequence in $S_k$ followed by $P(k) - (i + 1)$ random bits.
Also, $q_k^{i+1}$ is the probability that $C_k$ outputs 1 when given as input the first $i$ bits of a
sequence in $S_k$ followed by the incorrect $i + 1$st bit, followed by $P(k) - (i + 1)$ random
bits. Thus, $p_k^{i+1}$ is just the average of all values of $r_c$ taken over all $i$-bit prefixes and
$q_k^{i+1}$ is, similarly, the average of all values of $r_i$. If we could calculate each $r_c$ and $r_i$
exactly, our next-bit circuit would predict the $i + 1$st bit correctly with probability
$p_k^{i+1}$. From above, we know that:

$$p_k^{i+1} - q_k^{i+1} \geq \frac{2}{P(k)Q(k)}$$

where $P$ and $Q$ are both polynomials. Thus, if each of the estimates for $r_c$ and $r_i$ is
off by less than $1/(P(k)Q(k))$ then the resulting estimate for $p_k^{i+1}$ will still be greater
than the estimate for $q_k^{i+1}$. So, the next-bit circuit will still succeed more than half the
time.

We must therefore show that with only $Poly(k)$ trials, we can get estimates for
$r_c$ and $r_i$ which are off by less than $1/(P(k)Q(k))$. This is necessary to ensure that the
next-bit circuit has size polynomial in $k$ as required.

Let $p$ be a probability which is estimated by a value $\hat{p}$, calculated by performing $t$
trials of the sampling procedure described above. Then, by the Central Limit Theorem
(see [HPS]) we have:

$$\mathrm{Prob}(\,|\hat{p} - p| > \epsilon\,) \approx 2(1 - \Phi(\delta))$$

where

$$\delta = 2\epsilon\sqrt{t}$$

and

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp(-u^2/2)\, du.$$

Note that we can never be certain our estimate is within the bound $\epsilon$ but only sure
with arbitrarily high probability. If we wish to know how many trials are necessary in
order to be sure, except with probability $e$, that our estimate is within the bound $\epsilon$, we must
solve the following inequality for $t$:

$$2(1 - \Phi(2\epsilon\sqrt{t})) < e. \qquad (1)$$

In fact, when our procedure is converted into a circuit, we will be able to "build-in"
a particular set of $t$ strings of bits which can be used in sampling to assure that our
estimates are off by less than $1/(P(k)Q(k))$. By calculations which can be found in the
Appendix, (1) above is satisfied by:

$$t > \frac{\ln 2 - \ln e}{2\epsilon^2}.$$

As discussed above, we want $\epsilon = 1/(P(k)Q(k))$ and, for reasons which will be made
clear below, we set $e = 1/2^{P(k)}$:

$$t > \frac{1}{2}\,(\ln 2 - \ln(2^{-P(k)}))\,((P(k))^2 (Q(k))^2).$$

This is certainly satisfied by:

$$t \geq (P(k))^2 (Q(k))^2 + (P(k))^3 (Q(k))^2.$$

We will now show that if the number of trials $t$ satisfies this inequality then
the next-bit circuit can actually be built in such a way that, given any $i$-bit prefix
of a string in $S_k$ as input, the circuit can, with certainty, estimate $r_c$ and $r_i$ with
an error of less than $1/(P(k)Q(k))$. The argument is probabilistic. Suppose that
$t = (P(k))^2 (Q(k))^2 + (P(k))^3 (Q(k))^2$ and let $W = \{w_1, \ldots, w_t\}$ be a set of random
strings each of length $P(k) - (i + 1)$. In other words each element $w_j$ of $W$ is a string
of random bits:

$$w_j = b_{i+2}, b_{i+3}, \ldots, b_{P(k)}.$$

These strings will be used to do the sampling. Let $x$ be a randomly chosen $i$-bit prefix
of a string in $S_k$. We will say that the set $W$ fails on $x$ if, given $x$ as input, at least one of
the probabilities $r_c$ and $r_i$ cannot be estimated with an error of less than $1/(P(k)Q(k))$
when using all of the strings in $W$ to do the sampling. Since the chance of failing on
either $r_c$ or $r_i$ is less than $1/2^{P(k)}$ (because $e = 1/2^{P(k)}$), we get:

$$\mathrm{Prob}[\,W \text{ fails on a random } i\text{-bit prefix of a string in } S_k\,] < \frac{2}{2^{P(k)}}.$$
Since there are at most $2^i$ $i$-bit prefixes:

$$\mathrm{Prob}[\,W \text{ fails on at least one } i\text{-bit prefix of a string in } S_k\,] < \frac{2 \cdot 2^i}{2^{P(k)}}.$$

Since the next-bit circuit is trying to predict the $i + 1$st bit and strings in $S_k$ have
length $P(k)$, we know that $i$ can be at most $P(k) - 1$. Therefore,

$$\mathrm{Prob}[\,W \text{ fails on at least one } i\text{-bit prefix of a string in } S_k\,] < 1.$$

So, there must exist at least one set W of t strings which does not fail on any inputs to 
the next-bit circuit. This set W will be "built into" the next-bit circuit, assuring that 
this circuit can always estimate the probabilities r c and r,- with sufficient accuracy to 
be sure of predicting correctly more than half the time. Since t is polynomial in k, the 
next-bit circuit will have size polynomial in $k$ as required. A complete algorithm for
computing the next bit is given below. It should be clear that this procedure
can be converted into a polynomial-size next-bit test.
Algorithm A3

input: an $i$-bit prefix of a string $x \in S_k$ given as $b_1, \ldots, b_i$

output: $y$ = $i + 1$st bit of $x$ (with probability $\frac{1}{2} + \frac{1}{R(k)}$ for some polynomial $R(k)$)

(1) Set $count_0$ = 0

(2) Set $count_1$ = 0

(3) Set $t = (P(k))^2 (Q(k))^2 + (P(k))^3 (Q(k))^2$

(4) For sample = 1 to $t$ do

    (4.1) Choose next "built-in" input for sampling: $b_{i+2}, \ldots, b_{P(k)}$

    (4.2) Set $count_0 = count_0 + C_k(b_1, \ldots, b_i, 0, b_{i+2}, \ldots, b_{P(k)})$

    (4.3) Set $count_1 = count_1 + C_k(b_1, \ldots, b_i, 1, b_{i+2}, \ldots, b_{P(k)})$

(5) Set $q_0 = count_0 / t$

(6) Set $q_1 = count_1 / t$

(7) Output $y = 0$ with probability $\frac{q_0}{q_0 + q_1}$
    Output $y = 1$ with probability $\frac{q_1}{q_0 + q_1}$
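
A direct transcription of these steps into Python might look as follows. This is a sketch under stated assumptions: the statistical test is passed in as an ordinary function, the "built-in" tails are supplied as a list, the tie-breaking rule for $q_0 + q_1 = 0$ is an addition of this sketch (the algorithm as stated does not cover it), and the toy test `xor_test` is hypothetical.

```python
import random


def predict_next_bit(prefix, test, samples, rng):
    """Guess the bit following `prefix`, following steps (1)-(7) of A3.

    samples : the "built-in" tails b_{i+2}, ..., b_{P(k)}, one tuple per trial
    """
    count0 = count1 = 0
    for tail in samples:
        count0 += test(tuple(prefix) + (0,) + tail)
        count1 += test(tuple(prefix) + (1,) + tail)
    t = len(samples)
    q0, q1 = count0 / t, count1 / t
    if q0 + q1 == 0:
        # Not covered by the algorithm as stated; this sketch guesses uniformly.
        return rng.randrange(2)
    return 1 if rng.random() < q1 / (q0 + q1) else 0


# Toy setting (hypothetical): strings of length 5 whose third bit is the XOR
# of the first two; the test accepts exactly the strings with that property.
N, I = 5, 2


def xor_test(bits):
    return 1 if bits[2] == bits[0] ^ bits[1] else 0
```

With this toy test one of the two counts is always zero, so the randomized choice in step (7) becomes deterministic; for a weaker test the same code outputs the more likely bit only with the corresponding bias.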

This concludes the section on estimating the probabilities as well as the proof of
Theorem 2. Since the output of any CSB generator passes all next-bit tests, we have
shown:

Corollary 2 Let $G$ be a CSB generator whose output consists of the set $S = \bigcup_k S_k$
as defined above. Then $S$ passes all polynomial-size statistical tests.

2.3 Simulating Probabilistic Computation 

It remains to explore the usefulness of CSB generators in simulating general 
probabilistic computation. It is here that the notion of statistical tests and the result 
of Theorem 2 will be helpful. First, we give two standard definitions from complexity 
theory. 

Definition The class $DTIME(T(n))$ denotes the family of languages accepted by
deterministic Turing machines which halt after at most $T(n)$ steps, where $n$ is the
input length.

Definition A language $L$ is in the class $R$ ("Random polynomial time") if and only if
there exists a polynomial $P$ and a Turing machine $M_R$ such that:
(i) $M_R$ takes two inputs $x$ and $y$; input $x$ is to be checked for membership in $L$ and the
meaning of input $y$ will become clear below. The machine $M_R$ runs in time $P(|x|)$.
(ii) if $x \notin L$ then for all $y$ such that $|y| = P(|x|)$ we have $\mathrm{Prob}\{M_R(x, y) \text{ accepts}\} = 0$.
(iii) if $x \in L$ then for a randomly chosen $y$ such that $|y| = P(|x|)$ we have $\mathrm{Prob}\{M_R(x, y) \text{ accepts}\} = \frac{1}{2}$.

The input $y$ in the above definition can be thought of as a random sequence of bits
or coin flips which may witness the membership of $x$ in $L$. For the sake of clarity in our
definition, the string $y$ is given to $M_R$ as input and the machine then computes a result
deterministically which depends only on $x$ and $y$. Intuitively, one may also choose
to think of $M_R$ as being given only $x$ and actually generating the string $y$ through
some random selection process such as coin flips. In this setting, $M_R$ is considered a
"probabilistic" machine. Given some input $x$, if $M_R$ outputs 1, we know for certain
that $x \in L$ but if $M_R$ outputs 0 we know only that $x \notin L$ with probability $\frac{1}{2}$. The
computation of $M_R$ on $x$ can be repeated a number of times so that we can be sure of
our answer with arbitrarily high probability.

In terms of our original definition, if we run $M_R(x, y)$ for each of the $2^{P(|x|)}$ possible
values of $y$ and the machine outputs 0 every time, then we know with total certainty
that $x \notin L$. Thus, we have essentially shown the following:

Fact $R \subseteq \bigcup_{\epsilon > 0} DTIME(2^{n^\epsilon})$.

Proof Let $L \in R$ be accepted by a probabilistic machine $M_R$ in time $P(n)$ where $P$
is a polynomial. The language $L$ can be accepted by a deterministic machine $M$ as
follows. On input $x$ machine $M$ runs $M_R(x, y)$ on each of the $2^{P(|x|)}$ possible strings
$y$ of length equal to $P(|x|)$. The machine $M$ accepts $x$ if and only if there exists
some value of $y$ such that $M_R(x, y)$ accepts. Otherwise $M$ rejects. It should be clear
that $M$ accepts exactly the language $L$ in time $2^{Q(n)}$ for some polynomial $Q$. Thus
$L \in \bigcup_{\epsilon > 0} DTIME(2^{n^\epsilon})$.
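
The exhaustive simulation in this proof can be sketched in a few lines. The machine `composite_machine` is a hypothetical stand-in with one-sided error (it never accepts a prime); its acceptance probability on members is not claimed to match the definition of $R$, and the function names are illustrative.

```python
from itertools import product


def accepts_by_enumeration(machine, x, witness_length):
    """Deterministic simulation: accept iff some witness y of the given
    length makes machine(x, y) accept."""
    return any(machine(x, y) for y in product((0, 1), repeat=witness_length))


def composite_machine(x, y):
    """Toy one-sided machine (hypothetical): read y as a binary number d and
    accept iff d is a nontrivial divisor of x."""
    d = int("".join(map(str, y)), 2) if y else 0
    return 1 < d < x and x % d == 0
```

The loop visits all $2^{|y|}$ witnesses, which is exactly the source of the exponential running time noted in the proof.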

The class $\bigcup_{\epsilon > 0} DTIME(2^{n^\epsilon})$ is also known as "exponential time". Problems
requiring exponential time are considered intractable since the cost of solving them
by computer quickly becomes prohibitive. If we could somehow avoid having to test
every string $y$ of length $P(|x|)$ in our simulation of machine $M_R$, this simulation could
obviously be sped up. A natural question at this point is whether the bit sequences
output by a CSB generator are sufficiently "random" to be useful in this context. Yao
has answered this question affirmatively by proving that such a generator can in fact
be used to deterministically simulate a probabilistic machine and that the simulation
requires only "sub-exponential" time.

Theorem 3 (Yao [Y]) Suppose that for any polynomial $P$, it is possible to construct
a CSB generator which takes $k$-bit inputs and produces $P(k)$-bit outputs. Then

$$R \subseteq \bigcap_{\epsilon > 0} DTIME(2^{n^\epsilon}).$$

Proof Let $L \in R$ be accepted by a probabilistic machine $M_R$. Suppose that $M_R$ runs
in time $n^j$ where $n$ is the input length. We construct a deterministic machine $M$ which
accepts $L$ in sub-exponential time. In the previous, exponential-time simulation, the
machine $M$ would have tried every possible witness $y$ of length $n^j$ in simulating the
probabilistic machine. There are $2^{n^j}$ such strings which is what forces the exponential
time bound. Here, $M$ will only try those witnesses of length $n^j$ which are output by
the CSB generator. Suppose, for example, that the generator takes $k$-bit inputs and
produces $k^c$-bit outputs. Thus, an input to the generator (a "seed") of length $n^{j/c}$ will
produce an output of length $n^j$. The deterministic machine $M$ inputs each string of
length $n^{j/c}$ to the generator producing a possible witness $y$ of length $n^j$. $M$ simulates
$M_R(x, y)$ for each such $y$ and accepts if and only if $M_R$ ever accepts. Since there are at
most $2^{n^{j/c}}$ seeds of length $n^{j/c}$, the entire simulation can be carried out in time $2^{n^{d/c}}$
for some fixed $d$. By assumption, the generator can be constructed to "stretch" its
inputs by any polynomial amount. Here, this means that the value of $c$ can be chosen
to be any fixed value. Thus, for any $\epsilon$, the simulation can be carried out in time $2^{n^\epsilon}$.
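
The shape of this simulation can be sketched as follows. The "generator" used here, `repeat_twice`, is only a placeholder that fixes the interface; it is NOT pseudo-random, and the toy machine is the same hypothetical divisor checker used purely for illustration.

```python
from itertools import product


def accepts_with_generator(machine, x, generator, seed_length):
    """Accept iff machine(x, generator(seed)) accepts for some seed of the
    given length; only generator outputs are tried as witnesses."""
    return any(machine(x, generator(seed))
               for seed in product((0, 1), repeat=seed_length))


def repeat_twice(seed):
    """Placeholder generator stretching k bits to 2k bits by repetition.
    It is NOT pseudo-random; it only illustrates the interface."""
    return seed + seed


def composite_machine(x, y):
    """Toy one-sided machine (hypothetical): accept iff y, read as a binary
    number d, is a nontrivial divisor of x."""
    d = int("".join(map(str, y)), 2)
    return 1 < d < x and x % d == 0
```

With 2-bit seeds only the witnesses 0, 5, 10 and 15 are tried, so the composite input 9 is wrongly rejected. A generator this poor is easily told apart from random strings, which is why the correctness argument that follows relies on the generator passing every polynomial-size statistical test.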

Note that in our original definition of CSB generators (Lemma 1.2), we assumed
only that each seed could be found in probabilistic polynomial time. Here, where we
want to use the CSB generator for a deterministic algorithm, we simply try every string
of length $n^{j/c}$ as an input to the generator. Some of these might not be actual seeds
for the generator, hence the resulting output might not be a pseudo-random sequence.
However, this method will cover every possible seed which is all that is necessary here.

It remains to show that the machine $M$ accepts the same language $L$ which is
accepted by $M_R$. As usual, let $I_k$ be the set of inputs to the generator of length $k$
and let $S_k = \{s_x \mid x \in I_k\}$. We will show that if $M$ does not accept $L$, then the set
$S = \bigcup_k S_k$ output by the generator fails a polynomial-size statistical test. The relative
ease of this proof should make clear the reason for introducing statistical tests as
criteria for our generators.

Recall that if $x \in L$ then for a randomly chosen string $y$, $M_R(x, y)$ accepts with
probability $\frac{1}{2}$. If $x \notin L$, then for no string $y$ does $M_R(x, y)$ accept. So if $M$ does
not accept the language $L$, there must be some $z \in L$ which $M$ incorrectly rejects
(i.e., none of the possible witnesses $w$ output by the CSB generator cause $M_R(z, w)$ to
accept even though $z \in L$). Furthermore, there must exist infinitely many such strings
$z \in L$ or else $M$ could be repaired without increasing its running time.

Let $z$ be any element of $L$ which $M$ fails to accept. Recall that the probabilistic
machine $M_R$ runs in time $n^j$ where $n$ is input length. So, on input $z$, the deterministic
machine $M$ will simulate $M_R(z, y)$ on all strings $y$ of length $|z|^j$ output by the CSB
generator. Thus $M$ will have to try all seeds of length $|z|^{j/c}$ since the generator
"stretches" $k$-bit inputs into $k^c$-bit outputs. For the remainder of the proof, we will
write $k$ in place of $|z|^{j/c}$.

For each of the infinitely many such strings $z$, there will be a circuit $C_k$ with
$k^c = (|z|^{j/c})^c = |z|^j$ inputs which has $z$ "hard-wired" in. On any input $y$, the circuit
$C_k$ simulates $M_R(z, y)$. This circuit outputs 1 if and only if $M_R(z, y)$ accepts.

Since $z \in L$ we have, by the definition of $R$, for a randomly chosen string $y$ of
length $|z|^j$:

$$p_k^R = \mathrm{Prob}\{C_k(y) = 1\} = \frac{1}{2}.$$

However, since $M_R(z, y)$ does not accept for any $y \in S_k$, for a randomly chosen $y \in S_k$
we have:

$$p_k^S = \mathrm{Prob}\{C_k(y) = 1\} = 0.$$

Thus $p_k^R - p_k^S = \frac{1}{2}$ so there is certainly a polynomial $Q$ such that $p_k^R - p_k^S \geq \frac{1}{Q(k)}$.
Therefore, the set $S$ fails the statistical test $\{C_k\}$ contradicting our assumption that
$S$ is the output of a CSB generator. This concludes the proof.

Recall now Theorem 1 which said that given any weak one-way function, it is 
possible to construct a CSB generator which "stretches" its seeds by any polynomial 
amount. Combined with Theorem 3, we get: 

Corollary 3 If there exists a weak one-way function, then $R \subseteq \bigcap_{\epsilon > 0} \mathrm{DTIME}(2^{n^\epsilon})$.

Chapter 3
Refinements of Yao's Theorems 



In this chapter we refine somewhat Yao's results presented in Chapter 2. More 
specifically, we assume the existence of two new kinds of one-way function, both in turn 
different from the weak one-way function described in Chapter 2. We then explore the
types of pseudo-random bit generators which can be constructed from these functions. 



3.1 A Low-Level Refinement 

Recall that in Chapter 2 we defined a weak one-way function as being polynomial-time
computable but difficult to invert by any family of polynomial-size circuits. Here,
we consider an even more restricted definition. Informally, an R-weak one-way function
is polynomial-time computable but difficult to invert by any family of $R(k)$-size circuits
where $R(k)$ is some polynomial. A rigorous definition of these functions follows the
definition of "O notation":

Definition Let $f(n)$ and $g(n)$ be functions. We say that:

$$f(n) = O(g(n))$$

if there exists some constant $c$ such that for all sufficiently large values of $n$:

$$f(n) \le c \, g(n).$$

Definition Let $I_k$ be the set of all $k$-bit strings, let $D_k \subseteq I_k$ and let $f_k : D_k \to D_k$ be a
sequence of permutations. We will write $D = \bigcup_k D_k$ and $f = \{f_k\}$. Let $R(k)$ be some
polynomial. Then, $f$ is an R-weak one-way function (or simply an R-one-way function)
if the following properties are satisfied:

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input $k$, selects an $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$f_k(x)$.

(3) There exists a polynomial $Q$ such that the following holds. Let $C = \{C_k\}$ be
any family of circuits where each $C_k$ has $k$ inputs and size $O(R(k))$. Then for all
sufficiently large $k$:

$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{Q(k)}$ of the $x \in D_k$.

Just as we did in Chapter 2, it is possible to construct a pseudo-random bit
generator from an R-one-way function. However, as might be expected, the output
from such a generator cannot be expected to pass all polynomial-size next-bit tests.
Instead, we need a weaker notion of such a test, called an RQ-next-bit test. Intuitively,
a set of strings passes an RQ-next-bit test if no family of size $R(k)$ circuits can,
infinitely often, predict the next bit with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$.

Definition Let $P$, $Q$ and $R$ be fixed polynomials, $S_k$ a multiset consisting of $P(k)$-long
bit sequences and $S = \bigcup_k S_k$. An RQ-next-bit test for $S$ is a family of circuits
$C = \{C_k^i\}$. Each circuit $C_k^i$ has $i$ Boolean inputs where $i < P(k)$, one Boolean output
and size $O(R(k))$. On input the first $i$ bits of a sequence $s$ randomly selected from $S_k$,
$C_k^i$ will output a bit $b$. Let $p_{k,i}^C$ denote the probability that $b$ = the $(i+1)$st bit of $s$. We
say that $S$ passes the test $C$ if for all sufficiently large $k$ and all $i < P(k)$:

$$p_{k,i}^C < \frac{1}{2} + \frac{1}{Q(k)}.$$

We will refer to both $C$ and $C_k^i$ as RQ-next-bit tests.

Note the differences between this definition and the definition of general next-bit tests
given in Chapter 2. Here, we restrict both the size of the circuit doing the predicting,
as well as the degree of accuracy to which it can predict. In general, we will expect that
the polynomial $R$, the size of the next-bit circuit, will depend on the polynomial $Q$, the
accuracy threshold. The reason for this will be made clear below. Finally, we define
the analogous notion of the CSB generator, an RQ-weak pseudo-random bit generator:

Definition Let $P$, $Q$ and $R$ be fixed polynomials, $I_k$ the set of all strings of length $k$,
and $D_k \subseteq I_k$ a set of inputs (or "seeds") of length $k$. Let $G$ be a deterministic algorithm
which, on input a seed $x \in D_k$, outputs a $P(k)$-long bit sequence $s_x$ in $\mathrm{Poly}(k)$ time. Let
$S_k = \{s_x \mid x \in D_k\}$. The algorithm $G$ is a Cryptographically RQ-weak pseudo-random
bit generator (or simply an RQ-weak generator) if the multiset $S = \bigcup_k S_k$ passes every
RQ-next-bit test.

Recall that in Chapter 2 we showed, given a function which was difficult to invert
with any polynomial-size circuit, how to construct a bit generator whose output passed
all polynomial-size next-bit tests. Here, given a function which is difficult to invert
with some fixed polynomial-size circuit, it would be nice to construct a generator
whose output passed all next-bit tests of some, possibly different, fixed polynomial size.
Unfortunately, using the techniques of Chapter 2, this does not seem possible. The
logic behind the proof in Chapter 2 was that if there existed a polynomial-size next-bit
test which succeeded with probability $\frac{1}{2} + \frac{1}{Q(k)}$ for some polynomial $Q$, then it would be
possible to construct a polynomial-size circuit to invert the assumed one-way function
on most inputs. Unfortunately, the size of this circuit depended on the polynomial
$Q(k)$. Thus if we assume the existence of a U-one-way function, it is only possible,
using these methods, to construct a bit generator whose output passes all next-bit
tests of some fixed size which succeed with probability less than $\frac{1}{2} + \frac{1}{Q(k)}$ for some fixed
polynomial $Q$. We now state the main theorem of this section:
Theorem 4 Let $I_k$ be the set of all strings of length $k$ and let $f = \{f_k\}$ be a U-one-way
function over some accessible domain $E_k \subseteq I_k$. Thus, there exists a constant $d$ such
that, given any family of circuits $C = \{C_k\}$ where each $C_k$ has $k$ Boolean inputs and
size $O(U(k))$ we have, for all sufficiently large $k$:

$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{k^d}$ of the $x \in E_k$.

By definition, $f_k(x)$ can be computed in polynomial time for all $x \in E_k$; thus by a
standard result from complexity theory (see [FP]) we may assume that there exists
some polynomial $F(k)$ such that for each value of $k$, $f_k(x)$ can be computed by a circuit
of size $F(k)$ for all $x \in E_k$. Then, for any polynomial $P$, it is possible to construct an
RQ-weak generator which "stretches" seeds of length $k$ into outputs of length $P(k)$,
where:

Proof We first show a set of conditions, very similar to those given in Lemma 1.1,
which are sufficient for constructing weak generators. We then show how to satisfy
these conditions, given any U-one-way function.

Lemma 4.1 Let $I_k$ be the set of $k$-bit strings and let $D_k \subseteq I_k$. Let $g_k : D_k \to D_k$ be
a sequence of permutations and let $B_k : D_k \to \{0,1\}$ be a sequence of predicates. We
will write $D = \bigcup_k D_k$, $g = \{g_k\}$ and $B = \{B_k\}$. Let $P(k)$ be any polynomial. If the
following set of conditions hold, then it is possible to construct an RQ-weak generator
which "stretches" $k$-bit seeds into $P(k)$-bit outputs. The polynomial $R$ is equal to:

$$R(k) = T(k) - P(k) \cdot \mathrm{size}(B_k(g_k))$$

where $T(k)$ and $\mathrm{size}(B_k(g_k))$ are defined below.

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input $k$, chooses $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$g_k(x)$.

(3) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$B_k(g_k(x))$. Thus, $B_k(g_k(x))$ can be computed by a polynomial-size circuit. Let this
circuit have size $\mathrm{size}(B_k(g_k))$.

(4) Let $C = \{C_k\}$ be any family of circuits such that each $C_k$ has $k$ inputs and size
$O(T(k))$ for some fixed polynomial $T$ and let $Q$ be some fixed polynomial. Then,
for all sufficiently large $k$:

$C_k(x) \ne B_k(x)$ for at least a fraction $\frac{1}{2} - \frac{1}{Q(k)}$ of the $x \in D_k$.

Note that the polynomial $T$ may well be a function of the polynomial $Q$; thus, as
expected, $R$ will probably depend on $Q$.

Proof of Lemma 4.1 First we construct the weak generator and then prove that its
outputs must pass every RQ-next-bit test. The proof is very similar to that of Lemma
1.1.

Choose an appropriate value of $k$ to be the seed length and choose in probabilistic
polynomial time a random $x \in D_k$ to be used as the seed. Set $c = P(k)$, the desired
length of the output sequence, and generate the bits:

$$B_k(g_k(x)), B_k(g_k^2(x)), \ldots, B_k(g_k^c(x)).$$

The notation $g_k^j$ indicates the $j$-fold composition of $g_k$. Now, output these bits in
reverse order, i.e.:

$$B_k(g_k^c(x)), B_k(g_k^{c-1}(x)), \ldots, B_k(g_k(x)).$$

It should be clear that all of this can be accomplished in polynomial time by conditions
(2) and (3).
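
The construction just described can be sketched in a few lines. This is a minimal illustration, not the thesis's generator: `toy_g` and `toy_B` are hypothetical stand-ins chosen only so the code runs (an affine map modulo $2^8$ is a permutation but is certainly not one-way), and the seed length `K = 8` is an arbitrary assumption.

```python
def generate(x, c, g, B):
    """Return B(g^c(x)), B(g^(c-1)(x)), ..., B(g(x)) as a list of bits."""
    bits = []
    y = x
    for _ in range(c):
        y = g(y)            # after j passes, y = g^j(x)
        bits.append(B(y))   # record B(g^j(x))
    bits.reverse()          # emit in reverse order of computation
    return bits

K = 8                                        # toy seed length (assumption)
def toy_g(x): return (5 * x + 3) % (1 << K)  # a permutation of 0..255, not one-way
def toy_B(x): return (x >> (K - 1)) & 1      # most significant bit
```

For example, `generate(1, 4, toy_g, toy_B)` visits 8, 43, 218, 69 and emits their top bits in reverse order. The reversal matters for the proof: a predictor of the next output bit is asked to move backwards along the orbit of $g_k$, which is the hard direction.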

It remains to show that the sequences output by this generator pass every
RQ-next-bit test where:

$$R(k) = T(k) - P(k) \cdot \mathrm{size}(B_k(g_k))$$

where $P(k)$ is the amount that the generator "stretches" $k$-bit seeds, $\mathrm{size}(B_k(g_k))$ is the
minimum size of a circuit computing $B_k(g_k(x))$, and the predicate $B_k$ cannot be computed
with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$ by any circuit of size $O(T(k))$.

Suppose that this is not true. Then there exists a family of circuits $C = \{C_k^i\}$ where
each circuit $C_k^i$ has $i < P(k)$ inputs, size $O(R(k))$ and the following holds. For each of
infinitely many values of $k$ there exists some $i$ such that:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)}$$

where $p_{k,i}^C$ is the probability that the circuit $C_k^i$ outputs the correct $(i+1)$st bit of a
sequence when given the first $i$ bits as input.

We now construct a family of circuits $A = \{A_k\}$ where each $A_k$ has $k$ inputs,
size $O(T(k))$ and such that for infinitely many values of $k$, $A_k$ correctly computes the
predicate $B_k(x)$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$. This contradicts condition (4).

Choose one of the infinitely many values of $k$ such that for some $i < P(k)$:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)}.$$

The circuit $A_k$ uses $C_k^i$ as a "subroutine". On input $x \in D_k$, the circuit $A_k$ first
generates the $i$-bit sequence:

$$B_k(g_k^i(x)), B_k(g_k^{i-1}(x)), \ldots, B_k(g_k(x))$$

and inputs this sequence to the next-bit test circuit $C_k^i$. This can be accomplished
with $O(i \cdot \mathrm{size}(B_k(g_k)))$ circuit gates by condition (3). The circuit $A_k$ then outputs
whatever value $C_k^i$ outputs on these bits, making the total size of $A_k$ equal to:

$$\mathrm{size}(A_k) = O(i \cdot \mathrm{size}(B_k(g_k)) + R(k))$$
$$= O(i \cdot \mathrm{size}(B_k(g_k)) + T(k) - P(k) \cdot \mathrm{size}(B_k(g_k)))$$
$$= O(T(k)) \quad (\text{since } i < P(k)).$$

By showing that the circuit $A_k$ solves the predicate $B_k$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$
we get a contradiction to condition (4).
Note that the bits:

$$B_k(g_k^i(x)), B_k(g_k^{i-1}(x)), \ldots, B_k(g_k(x))$$

are the first $i$ bits of the CSB sequence:

$$B_k(g_k^i(x)), \ldots, B_k(g_k(x)), B_k(x), \ldots, B_k(g_k^{i-c+1}(x)).$$

Since the $(i+1)$st bit of this sequence is $B_k(x)$, the circuit $A_k$ will correctly compute
$B_k(x)$ whenever the circuit $C_k^i$ correctly outputs the $(i+1)$st bit. Furthermore, the seed
of this sequence is $g_k^{i-c}(x)$ and since $g_k$ is, by assumption, a permutation, so is $g_k^{i-c}$.
Thus, $g_k^{i-c}$ generates all possible seeds $z \in D_k$, which means that the next-bit circuit
$C_k^i$ will correctly output the $(i+1)$st bit of this particular sequence for a fraction at
least $\frac{1}{2} + \frac{1}{Q(k)}$. Thus, the circuit $A_k$ computes $B_k(x)$ for a fraction at least $\frac{1}{2} + \frac{1}{Q(k)}$ of
the $x \in D_k$ and has size $O(T(k))$, contradicting condition (4). This completes the proof
of the lemma.
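
The reduction in this proof, which turns a next-bit predictor into a circuit for the predicate, can be sketched as follows. Everything named `toy_*` is a hypothetical assumption made for illustration: the toy permutation flips the low bit on every step, so the "next bit" is trivially predictable and the reduction succeeds on every input.

```python
K = 8                                        # toy parameter (assumption)
def toy_g(x): return (5 * x + 3) % (1 << K)  # permutation that flips the low bit
def toy_B(x): return x & 1                   # predicate: low bit

def build_A(predictor, g, B, i):
    """Circuit A_k as a function: feed B(g^i(x)), ..., B(g(x)) to the predictor."""
    def A(x):
        bits, y = [], x
        for _ in range(i):
            y = g(y)
            bits.append(B(y))
        bits.reverse()       # now bits = B(g^i(x)), ..., B(g(x))
        return predictor(bits)
    return A

def toy_predictor(prefix):
    """Next-bit test for the toy: the next bit is the complement of the last one."""
    return 1 - prefix[-1]

A = build_A(toy_predictor, toy_g, toy_B, 3)
```

Here `A(x)` equals `toy_B(x)` for every `x`, because the predictor's guess for the bit following $B(g(x))$ is exactly a guess for $B(x)$. With a real one-way function the predictor would only be correct on a fraction of inputs, and `A` would inherit that same fraction.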

We must now show that given any U-weak one-way function over an accessible domain
$E = \bigcup_k E_k$, it is possible to construct a new domain $D = \bigcup_k D_k$, a function $g = \{g_k\}$
and a predicate $B = \{B_k\}$ which satisfy the four conditions of Lemma 4.1.

Lemma 4.2 Let $f = \{f_k\}$ be a U-weak one-way function over an accessible domain
$E = \bigcup_k E_k$. Then, by definition, there exists some constant $d$ such that given any
family of circuits $C = \{C_k\}$ where $C_k$ has $k$ inputs and size $O(U(k))$, we have, for all
sufficiently large $k$:

$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{k^d}$ of the $x \in E_k$.

Furthermore, assume that $f_k(x)$ can be computed by a circuit of size $F(k)$. The
following construction satisfies conditions (1)-(4) of Lemma 4.1:

(a) Set $D_k$ to be the cartesian product of $k^{2d}$ copies of $E_k$. It will be less cumbersome
to write $ck$ instead of $k^{2d}$ so we will generally choose to do this, as in Chapter 2.
Thus, $D_k$ is the cartesian product of $ck$ copies of $E_k$ where $c = k^{2d-1}$. Formally:

$$D_k = \{(x_1, x_2, \ldots, x_{ck}) \mid x_1 \in E_k, \ldots, x_{ck} \in E_k\}.$$

(b) Let $g_k((x_1, x_2, \ldots, x_{ck})) = (f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck}))$ where each $x_j \in E_k$.

(c1) Let $B_k^i(x)$ = the $i$th bit of $f_k^{-1}(x)$ where $x \in E_k$.

(c2) Let

$$B_k((x_1, x_2, \ldots, x_{ck})) = \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(x_{c(i-1)+j})$$

where each $x_j \in E_k$ and "$\oplus$" denotes the "exclusive-or" operation.
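
To make the indexing in (b) and (c2) concrete, here is a small sketch. It assumes bits are numbered from the most significant position (the source does not fix a convention), and it computes the composed predicate $B_k(g_k(\cdot))$, which needs only the bits of the original $x_j$ because $f_k^{-1}(f_k(x_j)) = x_j$. The helper names are illustrative, not taken from the thesis.

```python
def ith_bit(x, i, k):
    """i-th bit of a k-bit integer, counting from the most significant (i = 1)."""
    return (x >> (k - i)) & 1

def g_product(f, xs):
    """g_k: apply f to every coordinate of the tuple xs."""
    return tuple(f(x) for x in xs)

def B_of_g(xs, k, c):
    """B_k(g_k(xs)): XOR over groups i = 1..k and members j = 1..c of the
    i-th bit of x_{c(i-1)+j}."""
    assert len(xs) == c * k
    acc = 0
    for i in range(1, k + 1):
        for j in range(1, c + 1):
            acc ^= ith_bit(xs[c * (i - 1) + j - 1], i, k)
    return acc
```

With $k = 2$ and $c = 2$, the tuple `(0b10, 0b01, 0b01, 0b11)` contributes the first bits of its first two entries and the second bits of its last two, giving `1 ^ 0 ^ 1 ^ 1 = 1`.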
Proof of Lemma 4.2 It should be fairly obvious that conditions (1), (2) and (3) of
Lemma 4.1 are satisfied by this construction. The domain $D_k$ is certainly accessible
since, by assumption, $E_k$ is accessible. Given any $(x_1, x_2, \ldots, x_{ck}) \in D_k$, the function
$g_k((x_1, x_2, \ldots, x_{ck}))$ can be computed in polynomial time since, by assumption, each
$f_k(x_j)$ is computable in polynomial time. Finally, the predicate:

$$B_k(g_k((x_1, x_2, \ldots, x_{ck}))) = B_k((f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck})))$$
$$= \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(f_k(x_{c(i-1)+j}))$$

can easily be computed in polynomial time since for any $x \in E_k$:

$$B_k^i(f_k(x)) = \text{the } i\text{th bit of } f_k^{-1}(f_k(x)) = \text{the } i\text{th bit of } x.$$

In fact, for any $(x_1, x_2, \ldots, x_{ck}) \in D_k$, the predicate $B_k(g_k((x_1, x_2, \ldots, x_{ck})))$ can be
computed by a circuit of size $O(ck) = O(k^{2d})$ since it simply consists of $k^{2d}$ exclusive-ors.
Thus, the value here of $\mathrm{size}(B_k(g_k))$ in condition (3) above is:

$$\mathrm{size}(B_k(g_k)) = O(k^{2d}).$$

The more difficult task is to prove that, as constructed, the predicate $B_k$ satisfies
condition (4) of Lemma 4.1. In particular, we must show that for an appropriately
chosen polynomial $T(k)$ the following holds. Let $C = \{C_k\}$ be any family of circuits
such that each circuit $C_k$ takes inputs from the domain $D_k$ constructed in (a) above,
and has size $O(T(k))$. Let $T(k)$ be set to:



where $Q(k)$ is a polynomial. We will show that for every such family of circuits and all
sufficiently large $k$:

$$C_k((x_1, x_2, \ldots, x_{ck})) \ne B_k((x_1, x_2, \ldots, x_{ck}))$$

for at least a fraction $\frac{1}{2} - \frac{1}{Q(k)}$ of the $(x_1, x_2, \ldots, x_{ck}) \in D_k$.

Actually, we will choose to formulate this problem slightly differently. For every such
family of circuits $\{C_k\}$ we will show, for all sufficiently large $k$:

$$\mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))] < \frac{1}{2} + \frac{1}{Q(k)}$$

where $(x_1, x_2, \ldots, x_{ck})$ is a randomly chosen element of $D_k$. This will be proven through
a sequence of two lemmas. First, we repeat the definition of the probability $s_{i,k}^C$ defined
in Chapter 2. For details on what a "group" is, see page 15.

Definition Let $C = \{C_k\}$ be a family of circuits where each $C_k$ takes inputs from the
domain $D_k$. Think of the circuit $C_k$ as trying to compute the predicate $B_k$. Now, for
each $x \in E_k$ define the probability $s_{i,k}^C(x)$ as follows:

$$s_{i,k}^C(x) = \mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))]$$

where $(x_1, x_2, \ldots, x_{ck})$ is randomly drawn from the elements of $D_k$
which have $x$ in the $i$th group.

Lemma 4.2.1 Let $C = \{C_k\}$ be a family of circuits where each circuit $C_k$ takes inputs
from the domain $D_k$. Suppose that there exists a polynomial $V(k)$ and a constant $d$
such that for all sufficiently large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[ s_{i,k}^C(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of } i \text{ from 1 to } k \right] \ge 1 - \frac{1}{k^d}.$$

If each circuit $C_k$ has size

$$O\left( \frac{U(k)}{k^2 V(k)} - k^{2d} F(k) \right)$$

then there exists a family of circuits $A = \{A_k\}$ where each $A_k$ has $k$ Boolean inputs,
size $O(U(k))$ and for all sufficiently large $k$:

$A_k(x) = f_k^{-1}(x)$ for at least a fraction $1 - \frac{1}{k^d}$ of the $x \in E_k$.

Proof The main details of the proof are identical to the proof of Lemma 1.2.2
in Chapter 2. The only remaining step necessary here is to show that, under the
assumptions of this lemma, Algorithm A2 on page 20 can be converted into a circuit
$A_k$ of size $O(U(k))$. We reproduce this algorithm here.

Algorithm A2

input: $x \in E_k$

output: $y \in E_k$ such that $y = f_k^{-1}(x)$ with probability $1 - \frac{1}{k^d}$

(1) Set $y$ = null

(2) For $h$ = 1 to $k$ do

(2.1) Set count = 0

(2.2) Set $t = 6kV(k)$

(2.3) For sample = 1 to $t$ do

(2.3.1) Choose next "built-in" input for sampling:

$x_1, x_2, \ldots, x_{c(h-1)}, x_{c(h-1)+2}, \ldots, x_{ck}$

(2.3.2) Set $u = C_k((f_k(x_1), \ldots, f_k(x_{c(h-1)}), x, f_k(x_{c(h-1)+2}), \ldots, f_k(x_{ck})))$

(2.3.3) Set $v = \bigoplus_{i=1}^{h-1} \bigoplus_{j=1}^{c} (i\text{th bit of } x_{c(i-1)+j})$
$\oplus \bigoplus_{j=2}^{c} (h\text{th bit of } x_{c(h-1)+j})$
$\oplus \bigoplus_{i=h+1}^{k} \bigoplus_{j=1}^{c} (i\text{th bit of } x_{c(i-1)+j})$

(2.3.4) Set count = count + $(u \oplus v)$

(2.4) Set avg = count / $t$

(2.5) If avg > $\frac{1}{2} + \frac{1}{2V(k)}$ then set $w$ = 1

(2.6) Elseif avg < $\frac{1}{2} - \frac{1}{2V(k)}$ then set $w$ = 0

(2.7) Set $y = w \circ y$ ($\circ$ denotes concatenation)

(3) Output $y$
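
The sampling loop of Algorithm A2 may be easier to follow as code. The sketch below is an illustration under explicit assumptions, not a transcription: it numbers bits from the most significant position, appends each recovered bit rather than prepending it, and replaces the two-sided test of steps (2.5)-(2.6) with a plain majority vote. The toy instance (`toy_f`, `perfect_C`, `K`, `GROUP`) is hypothetical; the predictor cheats with a lookup table so that the example runs deterministically.

```python
import random

def ith_bit(x, i, k):
    """i-th bit of a k-bit integer, counting from the most significant (i = 1)."""
    return (x >> (k - i)) & 1

def invert_by_sampling(x, f, C, k, c, t, rng):
    """Recover f^{-1}(x) bit by bit from a predictor C for the XOR predicate."""
    y = 0
    for h in range(1, k + 1):
        slot = c * (h - 1)                     # 0-based index of x_{c(h-1)+1}
        count = 0
        for _ in range(t):
            xs = [rng.randrange(1 << k) for _ in range(c * k)]
            inputs = [f(v) for v in xs]
            inputs[slot] = x                   # plant the target in group h
            u = C(tuple(inputs))               # C's guess for the XOR predicate
            v = 0
            for i in range(1, k + 1):          # XOR of every bit already known
                for j in range(1, c + 1):
                    pos = c * (i - 1) + j - 1
                    if pos != slot:
                        v ^= ith_bit(xs[pos], i, k)
            count += u ^ v                     # one vote that bit h equals 1
        w = 1 if 2 * count > t else 0          # simplified majority threshold
        y = (y << 1) | w                       # append bit h (MSB first)
    return y

# Toy instance (assumptions): k = 3, c = 2, an easily inverted permutation,
# and a perfect predictor that uses a lookup table for f^{-1}.
K, GROUP = 3, 2
def toy_f(v): return (5 * v + 3) % (1 << K)
INV = {toy_f(v): v for v in range(1 << K)}
def perfect_C(inputs):
    acc = 0
    for i in range(1, K + 1):
        for j in range(1, GROUP + 1):
            acc ^= ith_bit(INV[inputs[GROUP * (i - 1) + j - 1]], i, K)
    return acc
```

Since `perfect_C` is always right, `u ^ v` equals the $h$th bit of $f^{-1}(x)$ on every sample. With an imperfect predictor the votes are only biased toward the correct bit, which is why the algorithm takes $t$ samples per bit.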

Let us analyze the "circuit size" of this algorithm. There are two loops, one of $k$
iterations (step (2)) and one of $t$ iterations (step (2.3)). Inside the inner loop, steps
(2.3.1) and (2.3.3) each require $O(ck)$ circuit gates and step (2.3.2) requires size:

$$O(ck F(k) + \mathrm{size}(C_k))$$

where $F(k)$ is the assumed circuit size for computing $f_k$ and $\mathrm{size}(C_k)$ is the size of
the circuit $C_k$ which is computing the predicate $B_k$. Thus the entire algorithm can be
converted to a circuit $A_k$ which computes $f_k^{-1}$ with probability at least $1 - \frac{1}{k^d}$ and has
size:

$$O(kt[ck F(k) + \mathrm{size}(C_k)]).$$

The constant $c$ has been defined to be $k^{2d-1}$ and from step (2.2) we see that $t = O(kV(k))$. Also, by assumption, we know that:

$$\mathrm{size}(C_k) = O\left( \frac{U(k)}{k^2 V(k)} - k^{2d} F(k) \right).$$

Thus, the circuit $A_k$ has size:

$$O\left( k^2 V(k) \left[ k^{2d} F(k) + \frac{U(k)}{k^2 V(k)} - k^{2d} F(k) \right] \right) = O(U(k)).$$



This completes the proof of the lemma. Since $f_k$ is assumed to be a U-one-way function,
we have actually shown:

Corollary 4.2.1 Let $C = \{C_k\}$ be any family of circuits where each $C_k$ takes inputs
from the domain $D_k$ and has size:

$$O\left( \frac{U(k)}{k^2 V(k)} - k^{2d} F(k) \right).$$

Then for all sufficiently large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[ s_{i,k}^C(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of } i \text{ from 1 to } k \right] < 1 - \frac{1}{k^d}.$$



We need one more lemma to finish the proof of Theorem 4.

Lemma 4.2.2 Let $C = \{C_k\}$ be any family of circuits where each $C_k$ takes inputs
from the domain $D_k$ and has size:

$$O\left( \frac{U(k)}{k^2 Q(k)} - k^{2d} F(k) \right).$$

Then, for a randomly chosen $(x_1, x_2, \ldots, x_{ck}) \in D_k$ we have, for all sufficiently large $k$:

$$\mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))] < \frac{1}{2} + \frac{1}{Q(k)}.$$

Proof Suppose first that some family of circuits $C = \{C_k\}$ has the property that each
$C_k$ takes inputs from the domain $D_k$ and has size:

$$O\left( \frac{U(k)}{k^2 V(k)} - k^{2d} F(k) \right) \qquad (1)$$

for some fixed polynomial $V(k)$. Then, from Corollary 4.2.1, for all sufficiently large $k$
we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[ s_{i,k}^C(x) < \frac{1}{2} + \frac{1}{V(k)} \text{ for some value of } i \text{ from 1 to } k \right] > \frac{1}{k^d}.$$

From Lemma 1.2.3 of Chapter 2, it then follows that there exists a constant $a$ and
polynomial $Q$ such that $V(k) = aQ(k)$ and:

$$\mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))] < \frac{1}{2} + \frac{1}{Q(k)}.$$

Replacing $V(k)$ by $aQ(k)$ in line (1) above gives:

which is just: 

proving the lemma. 

Thus, the construction of Lemma 4.2 satisfies all four conditions of Lemma 4.1.
For this construction we have that the quantity $\mathrm{size}(B_k(g_k))$ of condition (3) is, as
noted above, equal to $k^{2d}$. As just proven in Lemma 4.2.2, the polynomial $T(k)$ in
condition (4) is equal to:

Thus, the construction of Lemma 4.2 produces an RQ-weak generator where:

This completes the proof of Theorem 4. 

3.2 Fast Simulation of Probabilistic Computation 

We now explore a very much stronger notion of a one-way function. Namely, we consider
functions which can be computed in polynomial time but whose inverse is difficult to
compute with high probability by any circuits of some fixed sub-exponential size. We
show that if any such function exists, it is possible to build a bit generator which
can be used to simulate polynomial-time probabilistic computation in deterministic
polynomial time.

Definition Let $I_k$ be the set of all $k$-bit strings, let $D_k \subseteq I_k$ and let $f_k : D_k \to D_k$
be a sequence of permutations. We will write $D = \bigcup_k D_k$ and $f = \{f_k\}$. Then, $f$
is a $2^{k^\alpha}$-weak one-way function (or simply a $2^{k^\alpha}$-one-way function) if the following
properties are satisfied:

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input $k$, selects an $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$f_k(x)$.

(3) There exists a polynomial $Q$ and a fixed constant $\alpha < 1$ such that the following
holds. If $C = \{C_k\}$ is a family of circuits where each $C_k$ has $k$ inputs and size
$\mathrm{Poly}(2^{k^\alpha})$ then for all sufficiently large $k$:

$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{Q(k)}$ of the $x \in D_k$.

We now define an analogous next-bit test which corresponds to this new function.
As might be expected, these next-bit tests will have sub-exponential size. Similarly,
the analogous bit generator will pass all such next-bit tests.

Definition Let $\alpha < 1$ be a fixed constant, $P$ be a polynomial, $S_k$ a multiset
consisting of $2^{k^\alpha}$-long bit sequences and $S = \bigcup_k S_k$. A $2^{k^\alpha}$-next-bit test for $S$ is a
family of circuits $C = \{C_k^i\}$. Each circuit $C_k^i$ has $i$ Boolean inputs where $i < 2^{k^\alpha}$,
one Boolean output and size $\mathrm{Poly}(2^{k^\alpha})$. On input the first $i$ bits of a sequence $s$
randomly selected from $S_k$, $C_k^i$ will output a bit $b$. Let $p_{k,i}^C$ denote the probability
that $b$ = the $(i+1)$st bit of $s$. We say that $S$ passes the test $C$ if for all sufficiently
large $k$ and all $i < 2^{k^\alpha}$:

We will refer to both $C$ and $C_k^i$ as $2^{k^\alpha}$-next-bit tests.

Definition Let $\alpha < 1$ be a fixed constant, $I_k$ the set of all strings of length $k$,
and $D_k \subseteq I_k$ a set of inputs (or "seeds") of length $k$. Let $G$ be a deterministic
algorithm which, on input a seed $x \in D_k$, outputs a $2^{k^\alpha}$-long bit sequence $s_x$ in
$\mathrm{Poly}(2^{k^\alpha})$ time. Let $S_k = \{s_x \mid x \in D_k\}$. The algorithm $G$ is a Cryptographically
$2^{k^\alpha}$-strong pseudo-random bit generator (or simply a $2^{k^\alpha}$-generator) if the multiset
$S = \bigcup_k S_k$ passes all $2^{k^\alpha}$-next-bit tests.

Definition The class P denotes those languages which can be accepted deterministically
in time polynomial in the length of the input.

We now state the main theorem of this section:

Theorem 5 If any $2^{k^\alpha}$-one-way function exists, then $R \subseteq P$.
Proof The proof will proceed as follows. First we show that if any $2^{k^\alpha}$-one-way
function exists, then it is possible to build a $2^{k^\alpha}$-generator. Then we show that
the sequences output by this generator can be used to simulate any probabilistic
machine in deterministic polynomial time. First we give a set of conditions which
is sufficient to build $2^{k^\alpha}$-generators.

Lemma 5.1 Let $\alpha < 1$ be a fixed constant, let $I_k$ be the set of $k$-bit strings and let
$D_k \subseteq I_k$. Let $g_k : D_k \to D_k$ be a sequence of permutations and let $B_k : D_k \to \{0, 1\}$
be a sequence of predicates. We will write $D = \bigcup_k D_k$, $g = \{g_k\}$ and $B = \{B_k\}$. If
the following set of conditions hold, then it is possible to construct a $2^{k^\alpha}$-generator.

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input $k$, chooses $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$g_k(x)$.

(3) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$B_k(g_k(x))$.

(4) Let $C = \{C_k\}$ be any family of circuits such that each $C_k$ has $k$ inputs and size
$\mathrm{Poly}(2^{k^\alpha})$. Let $Q$ be any polynomial. Then, for all sufficiently large $k$:

$C_k(x) \ne B_k(x)$ for at least a fraction $\frac{1}{2} - \frac{1}{Q(k)}$ of the $x \in D_k$.

Proof of Lemma 5.1 First we construct the $2^{k^\alpha}$-CSB generator and then prove that
its outputs must pass all $2^{k^\alpha}$-next-bit tests. The proof is very similar to that of Lemma
4.1.

Choose an appropriate value of $k$ to be the seed length and choose in probabilistic
polynomial time a random $x \in D_k$ to be used as the seed. Set $c = 2^{k^\alpha}$, the desired
length of the output sequence, and generate the bits:

$$B_k(g_k(x)), B_k(g_k^2(x)), \ldots, B_k(g_k^c(x)).$$

The notation $g_k^j$ indicates the $j$-fold composition of $g_k$. Now, output these bits in
reverse order, i.e.:

$$B_k(g_k^c(x)), B_k(g_k^{c-1}(x)), \ldots, B_k(g_k(x)).$$

It should be clear that all of this can be accomplished in time $\mathrm{Poly}(2^{k^\alpha})$ by conditions
(2) and (3).

It remains to show that the sequences output by this generator pass all $2^{k^\alpha}$-next-bit
tests. Suppose that this is not true. Then there exists a polynomial $Q$ and a family of
$\mathrm{Poly}(2^{k^\alpha})$-size circuits $C = \{C_k^i\}$ where each $C_k^i$ has $i < 2^{k^\alpha}$ inputs and the following
holds. For each of infinitely many values of $k$ there exists some $i$ such that:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)}$$

where $p_{k,i}^C$ is the probability that the circuit $C_k^i$ outputs the correct $(i+1)$st bit of a
sequence when given the first $i$ bits as input.

We now construct a family of circuits $A = \{A_k\}$ where each $A_k$ has $k$ inputs, size
$\mathrm{Poly}(2^{k^\alpha})$ and such that for infinitely many values of $k$, $A_k$ correctly computes the
predicate $B_k(x)$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$. This contradicts condition (4).

Choose one of the infinitely many values of $k$ such that for some $i < 2^{k^\alpha}$:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)}.$$

The circuit $A_k$ uses $C_k^i$ as a "subroutine". On input $x \in D_k$, the circuit $A_k$ first
generates the $i$-bit sequence:

$$B_k(g_k^i(x)), B_k(g_k^{i-1}(x)), \ldots, B_k(g_k(x))$$

and inputs this sequence to the next-bit test circuit $C_k^i$. This can be accomplished with
$O(i \cdot \mathrm{Poly}(k))$ circuit gates since conditions (2) and (3) imply that $B_k(g_k(x))$ can be
computed in polynomial size for all $x \in D_k$. Since $i < 2^{k^\alpha}$, this whole process takes
size:

$$O(2^{k^\alpha} \cdot \mathrm{Poly}(k)) = \mathrm{Poly}(2^{k^\alpha}).$$

The circuit $A_k$ then outputs whatever value $C_k^i$ outputs on these bits, making the total
size of $A_k$ equal to:

$$\mathrm{Poly}(2^{k^\alpha} + 2^{k^\alpha}) = \mathrm{Poly}(2^{k^\alpha}).$$

The proof that the circuit $A_k$ solves the predicate $B_k$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$
is given in Lemma 4.1 and so we omit it here. This completes the proof of the lemma.
We now show that, given any $2^{k^\alpha}$-one-way function over an accessible domain
$E = \bigcup_k E_k$, it is possible to construct a new domain $D = \bigcup_k D_k$, a function $g = \{g_k\}$ and
a predicate $B = \{B_k\}$ which satisfy the four conditions of Lemma 5.1. Since the proof
is virtually identical to that of Lemma 4.2, we give very few details.

Lemma 5.2 Let $f = \{f_k\}$ be a $2^{k^\alpha}$-weak one-way function over an accessible domain
$E = \bigcup_k E_k$. Then, by definition, there exists some constant $d$ such that given any
family of circuits $C = \{C_k\}$ where $C_k$ has $k$ inputs and size $\mathrm{Poly}(2^{k^\alpha})$, we have, for
all sufficiently large $k$:

$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{k^d}$ of the $x \in E_k$.

The following construction satisfies conditions (1)-(4) of Lemma 5.1:

(a) Set $D_k$ to be the cartesian product of $k^{2d}$ copies of $E_k$. It will be less cumbersome
to write $ck$ instead of $k^{2d}$ so we will generally choose to do this, as in Chapter 2.
Thus, $D_k$ is the cartesian product of $ck$ copies of $E_k$ where $c = k^{2d-1}$. Formally:

$$D_k = \{(x_1, x_2, \ldots, x_{ck}) \mid x_1 \in E_k, \ldots, x_{ck} \in E_k\}.$$

(b) Let $g_k((x_1, x_2, \ldots, x_{ck})) = (f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck}))$ where each $x_j \in E_k$.

(c1) Let $B_k^i(x)$ = the $i$th bit of $f_k^{-1}(x)$ where $x \in E_k$.

(c2) Let

$$B_k((x_1, x_2, \ldots, x_{ck})) = \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(x_{c(i-1)+j})$$

where each $x_j \in E_k$ and "$\oplus$" denotes the "exclusive-or" operation.
Proof of Lemma 5.2 It should be fairly obvious that conditions (1), (2) and (3) are
satisfied by this construction (for more details, see the proof of Lemma 4.2). It remains
to show that for any family of circuits $C = \{C_k\}$ where each $C_k$ takes inputs from the
domain $D_k$ and has size $\mathrm{Poly}(2^{k^\alpha})$, we get for all sufficiently large $k$:

$$\mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))] < \frac{1}{2} + \frac{1}{Q(k)}$$

where $(x_1, x_2, \ldots, x_{ck})$ is a randomly chosen element of $D_k$. We do this through a
sequence of two further lemmas.

Lemma 5.2.1 Let $C = \{C_k\}$ be a family of circuits where each circuit $C_k$ takes inputs
from the domain $D_k$ and has size $\mathrm{Poly}(2^{k^\alpha})$. Suppose that there exists a polynomial
$V(k)$ and a constant $d$ such that for all sufficiently large $k$ we have, for a randomly
chosen $x \in E_k$:

$$\mathrm{Prob}\left[ s_{i,k}^C(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of } i \text{ from 1 to } k \right] \ge 1 - \frac{1}{k^d}.$$

Then, there exists a family of circuits $A = \{A_k\}$ where each $A_k$ has $k$ Boolean inputs,
size $\mathrm{Poly}(2^{k^\alpha})$, and for all sufficiently large $k$:

$A_k(x) = f_k^{-1}(x)$ for at least a fraction $1 - \frac{1}{k^d}$ of the $x \in E_k$.

Proof The main details of the proof are identical to the proof of Lemma 1.2.2
in Chapter 2. The only remaining step necessary here is to show that, under the
assumptions of this lemma, Algorithm A2 on page ? can be converted into a circuit $A_k$
of size $\mathrm{Poly}(2^{k^\alpha})$. As we saw in Lemma 4.2, where Algorithm A2 is reproduced, this
algorithm can be converted into a circuit $A_k$ of size:

$$O(\mathrm{Poly}(k) \cdot \mathrm{size}(C_k))$$

where $\mathrm{size}(C_k)$ is the size of the circuit $C_k$. Since, by assumption, we have:

$$\mathrm{size}(C_k) = \mathrm{Poly}(2^{k^\alpha})$$

we know that $A_k$ has total size:

$$O(\mathrm{Poly}(k) \cdot \mathrm{Poly}(2^{k^\alpha})) = \mathrm{Poly}(2^{k^\alpha}).$$

This finishes the proof of the lemma. Since $f_k$ is assumed to be a $2^{k^\alpha}$-one-way function
we have actually shown:

Corollary 5.2.1 Let $C = \{C_k\}$ be any family of circuits where each $C_k$ takes inputs
from the domain $D_k$ and has size $\mathrm{Poly}(2^{k^\alpha})$. Let $V(k)$ be any polynomial. Then for all
sufficiently large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[ s_{i,k}^C(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of } i \text{ from 1 to } k \right] < 1 - \frac{1}{k^d}.$$

We need one more lemma to complete the proof of Lemma 5.2.

Lemma 5.2.2 Let $C = \{C_k\}$ be any family of circuits where each $C_k$ takes inputs
from the domain $D_k$ and has size $\mathrm{Poly}(2^{k^\alpha})$. Let $Q(k)$ be any polynomial. Then, for a
randomly chosen $(x_1, x_2, \ldots, x_{ck}) \in D_k$ we have, for all sufficiently large $k$:

$$\mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))] < \frac{1}{2} + \frac{1}{Q(k)}.$$

Proof Let $V(k)$ be any polynomial. Then, from Corollary 5.2.1, for all sufficiently
large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[ s_{i,k}^C(x) < \frac{1}{2} + \frac{1}{V(k)} \text{ for some value of } i \text{ from 1 to } k \right] > \frac{1}{k^d}.$$

Then, from Lemma 1.2.3, there exists a polynomial $Q(k) = aV(k)$ for some constant
$a$ such that, for a random $(x_1, x_2, \ldots, x_{ck}) \in D_k$:

$$\mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))] < \frac{1}{2} + \frac{1}{Q(k)}.$$

This completes the proof of both Lemma 5.2.2 and Lemma 5.2.

We now continue the proof of Theorem 5 by showing that the $2^{k^\alpha}$-generator which we
have just constructed also passes all statistical tests of a type similar to those defined
in Chapter 2.

Definition Let $\alpha < 1$ be a fixed constant, $S_k$ be a multiset consisting of $2^{k^\alpha}$-long
bit sequences and $S = \bigcup_k S_k$. A $2^{k^\alpha}$-statistical test for strings is a family of circuits
$C = \{C_k\}$. Each circuit $C_k$ has $2^{k^\alpha}$ Boolean inputs, one Boolean output and size
$\mathrm{Poly}(2^{k^\alpha})$. The multiset $S$ passes the test $C$ if for every polynomial $Q$, and all sufficiently
large $k$:

$$|p_k^S - p_k^R| < \frac{1}{Q(k)}$$

where $p_k^S$ denotes the probability that $C_k$ outputs 1 on a randomly selected $s \in S_k$
and $p_k^R$ denotes the probability that $C_k$ outputs 1 on a randomly selected $2^{k^\alpha}$-long
bit sequence. We will refer to both $C$ and $C_k$ as statistical tests.

We can now show that $2^{k^a}$-generators pass all $2^{k^a}$-statistical tests.

Lemma 5.3 Let $a < 1$ be a fixed constant, $S_k$ a multiset consisting of $2^{k^a}$-long bit
sequences and $S = \bigcup_k S_k$. If the set $S$ passes all $2^{k^a}$-next-bit tests then $S$ passes all
$2^{k^a}$-statistical tests.

Proof The proof is very similar to the proof of Theorem 2, so many details will be
omitted. Suppose that a multiset $S = \bigcup_k S_k$ fails some $2^{k^a}$-statistical test. Then,
by definition, there is a family of circuits $C = \{C_k\}$ such that $C_k$ has $2^{k^a}$ inputs and
size $Poly(2^{k^a})$. Furthermore, there exists a polynomial $Q$ such that for infinitely many
values of $k$:

$$|p_k^S - p_k^R| > \frac{1}{Q(k)}$$

where $p_k^S$ is the probability that $C_k$ outputs 1 on a randomly selected sequence in
$S_k$ and $p_k^R$ is the probability that $C_k$ outputs 1 on a randomly selected $2^{k^a}$-long bit
sequence. We now show that $S$ must necessarily fail a $2^{k^a}$-next-bit test.

Let $[x|i]y$ denote the concatenation of the first $i$ bits of $x$ with $y$. For each $i \leq 2^{k^a}$,
let $p_k^i$ denote the probability that $C_k([x|i]y) = 1$ where $x$ is chosen randomly from $S_k$
and $y$ is a random string of $2^{k^a} - i$ bits. Thus, $p_k^i$ is the probability that the statistical
test $C_k$ outputs 1 when given as input the first $i$ bits of some string in $S_k$ followed by
$2^{k^a} - i$ random bits. Note that $p_k^0 = p_k^R$ and $p_k^{2^{k^a}} = p_k^S$.
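
To make the hybrid probabilities concrete, the sketch below computes each $p_k^i$ by exhaustive enumeration for a toy multiset. Everything named here (`hybrid_probability`, `parity_test`, the 3-bit even-parity multiset) is a hypothetical stand-in chosen for illustration, not an object from the proof.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

Bits = Tuple[int, ...]


def hybrid_probability(
    circuit: Callable[[Bits], int],
    multiset: Sequence[Bits],
    length: int,
    i: int,
) -> float:
    """p^i: probability the circuit outputs 1 on [x|i]y, x from the multiset, y random."""
    accepted = 0
    total = 0
    for x in multiset:
        # y ranges over every string of the remaining length - i bits.
        for y in product((0, 1), repeat=length - i):
            accepted += circuit(tuple(x[:i]) + y)
            total += 1
    return accepted / total


def parity_test(s: Bits) -> int:
    """Toy test: output 1 exactly when the string has even parity."""
    return 1 if sum(s) % 2 == 0 else 0


# Toy multiset (an assumption for illustration): the four even-parity 3-bit strings.
EVEN_PARITY = [s for s in product((0, 1), repeat=3) if sum(s) % 2 == 0]

# p^0, p^1, p^2, p^3 for the toy case.
HYBRIDS = [hybrid_probability(parity_test, EVEN_PARITY, 3, i) for i in range(4)]
```

In this toy case the endpoints are the acceptance probability on random strings ($p^0$) and on the multiset itself ($p^3$), and the whole gap appears at the last step, where the final bit of the string is revealed.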

We now construct a family of circuits $A = \{A_k^i\}$ which constitute a next-bit test
for $S$. Each $A_k^i$ has $i < 2^{k^a}$ inputs and size $Poly(2^{k^a})$. Let $k$ be chosen such that:

$$|p_k^S - p_k^R| > \frac{1}{Q(k)}.$$

Assume, without loss of generality, that

$$p_k^S - p_k^R > \frac{1}{Q(k)}.$$

Then, clearly, there must be a value of $i$ such that

$$p_k^{i+1} - p_k^i > \frac{1}{Q(k)\,2^{k^a}}.$$

(Remember, $i$ varies between 0 and $2^{k^a}$.) The circuit $A_k^i$ will correctly predict the
$i+1$st bit of any sequence in $S_k$ with high probability.
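
The existence of such an $i$ can be checked by writing the gap between the two endpoint probabilities as a telescoping sum. The following is a short worked version, under the assumption $p_k^S > p_k^R$ and using $p_k^0 = p_k^R$ and $p_k^{2^{k^a}} = p_k^S$:

```latex
\begin{align*}
\frac{1}{Q(k)} < p_k^S - p_k^R
  &= p_k^{2^{k^a}} - p_k^0
   = \sum_{i=0}^{2^{k^a}-1} \left( p_k^{i+1} - p_k^i \right)
   \le 2^{k^a} \cdot \max_{0 \le i < 2^{k^a}} \left( p_k^{i+1} - p_k^i \right), \\
\text{hence}\quad
\max_{0 \le i < 2^{k^a}} \left( p_k^{i+1} - p_k^i \right)
  &> \frac{1}{Q(k)\, 2^{k^a}}.
\end{align*}
```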

Recall that $p_k^{i+1}$ is the probability that $C_k$ outputs 1 on any string consisting of
an $i+1$-bit initial segment of a string in $S_k$ followed by $2^{k^a} - (i+1)$ random bits. Let
$q_k^{i+1}$ be the probability that $C_k$ outputs 1 when given the first $i$ bits of a string in $S_k$
followed by the incorrect $i+1$st bit, followed by $2^{k^a} - (i+1)$ random bits. Note that:

$$p_k^i = \frac{1}{2}\, p_k^{i+1} + \frac{1}{2}\, q_k^{i+1}$$

thus:

$$p_k^{i+1} - p_k^i = \frac{1}{2}\left( p_k^{i+1} - q_k^{i+1} \right)$$

so:

$$p_k^{i+1} - q_k^{i+1} > \frac{2}{Q(k)\,2^{k^a}}. \qquad (1)$$

Consider in particular line (1). The fact that the difference between $p_k^{i+1}$ and $q_k^{i+1}$ is
large means that if $C_k$ is given as input the first $i$ bits of some string $s \in S_k$ followed
by $2^{k^a} - i$ random bits, it can effectively distinguish between those strings containing
the correct $i+1$st bit of $s$ in their $i+1$st position, and those containing the incorrect
$i+1$st bit.

Since $p_k^{i+1} + q_k^{i+1} = 1$ and $p_k^{i+1} > q_k^{i+1}$, if we can be sure of correctly predicting
the $i+1$st bit of a randomly drawn string in $S_k$ with probability $p_k^{i+1}$, then we will be
correct more than half the time. The next-bit test $A_k^i$ is constructed to do this.

So, suppose $A_k^i$ is given as input $b_1, b_2, \ldots, b_i$, the first $i$ bits of a randomly chosen
sequence $s \in S_k$. Let $b$ be the correct $i+1$st bit of $s$ and let $\bar{b}$ be the incorrect $i+1$st
bit. The circuit $A_k^i$ first calculates the probabilities:

$$r_c = \mathrm{Prob}[\,C_k(b_1, \ldots, b_i, b, b_{i+2}, \ldots, b_{2^{k^a}}) = 1\,]$$

and

$$r_i = \mathrm{Prob}[\,C_k(b_1, \ldots, b_i, \bar{b}, b_{i+2}, \ldots, b_{2^{k^a}}) = 1\,]$$

where $b_{i+2}, \ldots, b_{2^{k^a}}$ are chosen randomly. In fact, these probabilities can only be
estimated with high accuracy but, for the sake of clarity, let us assume for a moment
that they can be calculated exactly. A discussion on estimating the probabilities
appears later in the proof. Note that $A_k^i$ does not actually know which is the correct
and which the incorrect $i+1$st bit. It simply calculates the two probabilities; one of
them will be $r_c$, the other $r_i$. The predicting circuit $A_k^i$ now chooses $b_{i+1} = b$ with
probability $\frac{r_c}{r_c + r_i}$ and chooses $b_{i+1} = \bar{b}$ with probability $\frac{r_i}{r_c + r_i}$.

Suppose we calculate the value of $r_c$ for each $i$-bit prefix of a string in $S_k$ and take
the average of all these probabilities. This average is just the total probability with
which the circuit $A_k^i$ correctly predicts the $i+1$st bit of a randomly drawn string in $S_k$.
This average is also equal to $p_k^{i+1}$ so $A_k^i$ predicts correctly with probability $p_k^{i+1} > \frac{1}{2}$.
Thus there must exist some polynomial $R$ such that $A_k^i$ correctly chooses the $i+1$st
bit for sequences in $S_k$ with probability at least $\frac{1}{2} + \frac{1}{R(k)}$. The set $S$ therefore fails the
next-bit test $A = \{A_k^i\}$ and this part of the proof is finished.

Estimating the probabilities $r_c$ and $r_i$ is done using the same sampling procedure
described in the proof of Theorem 2. However, here we have from line (1) above that:

$$p_k^{i+1} - q_k^{i+1} > \frac{2}{Q(k)\,2^{k^a}}.$$

Thus, we would like our estimates to be off by less than $1/(2^{k^a} Q(k))$ and, for reasons
which will become clear below, we want the probability that an estimate is off by
this amount or more to be less than:

$$\frac{1}{2^{2 \cdot 2^{k^a}}}.$$

To get the number of trials $t$ we therefore solve the following inequality where
$c = 1/(2^{k^a} Q(k))$ and $\epsilon = 1/2^{2 \cdot 2^{k^a}}$:

$$t > \frac{\ln 2 - \ln \epsilon}{2c^2} = \frac{1}{2}\left( \ln 2 - \ln\left( 2^{-2 \cdot 2^{k^a}} \right) \right)\left( 2^{2k^a} (Q(k))^2 \right).$$

This is certainly satisfied by:

$$t > 2^{2k^a}(Q(k))^2 + 2^{3k^a}(Q(k))^2.$$
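
The claim that this choice of $t$ suffices can be checked numerically for small parameters. The sketch below is only a sanity check under stated assumptions: the integer `m` stands in for $k^a$ (which need not be an integer in general), `q` stands in for the value $Q(k)$, and the function names are hypothetical. The confidence parameter is handled through its base-2 logarithm so that $1/2^{2 \cdot 2^{k^a}}$ does not underflow.

```python
import math


def trials_lower_bound(c: float, log2_inv_eps: float) -> float:
    """Right-hand side (ln 2 - ln eps) / (2 c^2), where eps = 2 ** (-log2_inv_eps)."""
    ln_eps = -log2_inv_eps * math.log(2)
    return (math.log(2) - ln_eps) / (2 * c * c)


def chosen_trials(m: int, q: int) -> int:
    """t = 2^(2m) q^2 + 2^(3m) q^2, with m standing for k^a and q for Q(k)."""
    return (2 ** (2 * m)) * q * q + (2 ** (3 * m)) * q * q


def choice_is_sufficient(m: int, q: int) -> bool:
    """Check that the chosen t exceeds the lower bound for accuracy c and confidence eps."""
    c = 1.0 / ((2 ** m) * q)      # accuracy 1 / (2^(k^a) Q(k))
    log2_inv_eps = 2 * (2 ** m)   # eps = 1 / 2^(2 * 2^(k^a))
    return chosen_trials(m, q) > trials_lower_bound(c, log2_inv_eps)
```

For instance, `all(choice_is_sufficient(m, q) for m in range(1, 8) for q in range(1, 6))` holds, which matches the observation that $\frac{1}{2}\ln 2 \,(1 + 2 \cdot 2^{k^a}) < 1 + 2^{k^a}$.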

We will now show that if the number of trials $t$ satisfies this inequality then the
next-bit circuit can actually be built in such a way that, given any $i$-bit prefix of a string
in $S_k$ as input, the circuit can, with certainty, estimate $r_c$ and $r_i$ with an error of less than
$1/(2^{k^a} Q(k))$. The argument is probabilistic. Suppose that $t = 2^{2k^a}(Q(k))^2 + 2^{3k^a}(Q(k))^2$
and let $W = \{w_1, \ldots, w_t\}$ be a set of random strings each of length $2^{k^a} - (i+1)$. In
other words each element $w_j$ of $W$ is a string of random bits:

$$w_j = b_{i+2}, b_{i+3}, \ldots, b_{2^{k^a}}.$$

These strings will be used to do the sampling. Let $x$ be a randomly chosen $i$-bit prefix
of a string in $S_k$. We will say that the set $W$ fails on $x$ if, given $x$ as input, at least one
of the probabilities $r_c$ and $r_i$ cannot be estimated with an error of less than $1/(2^{k^a} Q(k))$
when using all of the strings in $W$ to do the sampling. Since the chance of failing on
either $r_c$ or $r_i$ is less than $1/2^{2 \cdot 2^{k^a}}$ (because $\epsilon = 1/2^{2 \cdot 2^{k^a}}$), we get:

$$\mathrm{Prob}[\,W \text{ fails on a random } i\text{-bit prefix of a string in } S_k\,] < \frac{2}{2^{2 \cdot 2^{k^a}}}.$$

Since there are at most $2^i$ $i$-bit prefixes:

$$\mathrm{Prob}[\,W \text{ fails on at least one } i\text{-bit prefix of a string in } S_k\,] < \frac{2}{2^{2 \cdot 2^{k^a}}}\,(2^i).$$

Since the next-bit circuit is trying to predict the $i+1$st bit and strings in $S_k$ have
length $2^{k^a}$, we know that $i$ can be at most $2^{k^a} - 1$. Therefore,

$$\mathrm{Prob}[\,W \text{ fails on at least one } i\text{-bit prefix of a string in } S_k\,] < 1.$$

So, there must exist at least one set $W$ of $t$ strings which does not fail on any inputs to
the next-bit circuit. This set $W$ will be "built into" the next-bit circuit, assuring that
this circuit can always estimate the probabilities $r_c$ and $r_i$ with sufficient accuracy to
be sure of predicting correctly more than half the time. Since $t$ is polynomial in $2^{k^a}$, the
next-bit circuit will have size polynomial in $2^{k^a}$ as required. A complete algorithm for
computing the next bit is given below.

Algorithm A3

input: an $i$-bit prefix of a string $x \in S_k$ given as $b_1, \ldots, b_i$

output: $y$ = $i+1$st bit of $x$ (with probability $\frac{1}{2} + \frac{1}{R(k)}$ for some polynomial $R(k)$)

(1) Set $count_0 = 0$

(2) Set $count_1 = 0$

(3) Set $t = 2^{2k^a}(Q(k))^2 + 2^{3k^a}(Q(k))^2$

(4) For $sample = 1$ to $t$ do

(4.1) Choose next "built-in" input for sampling: $b_{i+2}, \ldots, b_{2^{k^a}}$

(4.2) Set $count_0 = count_0 + C_k(b_1, \ldots, b_i, 0, b_{i+2}, \ldots, b_{2^{k^a}})$

(4.3) Set $count_1 = count_1 + C_k(b_1, \ldots, b_i, 1, b_{i+2}, \ldots, b_{2^{k^a}})$

(5) Set $q_0 = count_0 / t$

(6) Set $q_1 = count_1 / t$

(7) Output $y = 0$ with probability $\frac{q_0}{q_0 + q_1}$
    Output $y = 1$ with probability $\frac{q_1}{q_0 + q_1}$
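
The steps of Algorithm A3 can be sketched in a few lines of code. This is an illustration rather than the circuit itself: `predict_next_bit`, `toy_circuit` and the explicit `samples` argument (playing the role of the built-in set $W$) are hypothetical names, and the fair-coin fallback when both counts are zero is an added assumption that A3 does not specify.

```python
import random
from typing import Callable, Sequence

Circuit = Callable[[Sequence[int]], int]  # hypothetical stand-in for the test C_k


def predict_next_bit(
    circuit: Circuit,
    prefix: Sequence[int],
    samples: Sequence[Sequence[int]],
    rng: random.Random,
) -> int:
    """Steps (1)-(7) of Algorithm A3 for one i-bit prefix.

    Each entry of `samples` supplies the bits b_{i+2}, ..., b_{2^{k^a}}
    for one trial, as the built-in set W would.
    """
    count0 = 0
    count1 = 0
    for suffix in samples:                                    # step (4)
        count0 += circuit(list(prefix) + [0] + list(suffix))  # step (4.2)
        count1 += circuit(list(prefix) + [1] + list(suffix))  # step (4.3)
    t = len(samples)
    q0 = count0 / t                                           # step (5)
    q1 = count1 / t                                           # step (6)
    if q0 + q1 == 0:
        # Assumption, not part of A3: fair coin if the circuit never outputs 1.
        return rng.randrange(2)
    return 0 if rng.random() < q0 / (q0 + q1) else 1          # step (7)


def toy_circuit(bits: Sequence[int]) -> int:
    """Toy test: outputs 1 exactly when the third bit is the XOR of the first two."""
    return 1 if bits[2] == (bits[0] ^ bits[1]) else 0
```

With 4-bit strings, a 2-bit prefix and `samples = [[0], [1]]`, the toy circuit accepts only one of the two candidate third bits, so the prediction is deterministic: `predict_next_bit(toy_circuit, [1, 0], [[0], [1]], random.Random(0))` returns the XOR of the prefix bits.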

It should be clear that this algorithm can be built into a next-bit circuit of size:

$$O(t \cdot size(C_k))$$

where $size(C_k)$ is the size of the circuit $C_k$. By assumption we have:

$$size(C_k) = Poly(2^{k^a})$$

and from step (3) we see that:

$$t = Poly(2^{k^a}).$$

Thus, we can build a next-bit circuit of size $Poly(2^{k^a})$ which succeeds with probability
at least $\frac{1}{2} + \frac{1}{R(k)}$ for some polynomial $R(k)$. This completes the proof of Lemma 5.3.

Since, by definition, any $2^{k^a}$-generator passes all $2^{k^a}$-next-bit tests, we have actually
shown:

Corollary 5.3 Let $G$ be a $2^{k^a}$-generator whose output consists of the set $S = \bigcup_k S_k$
as defined above. Then $S$ passes all $2^{k^a}$-statistical tests.

We are now finally ready to show how $2^{k^a}$-generators can be used to simulate
probabilistic computation deterministically in polynomial time.

Lemma 5.4 Suppose that a $2^{k^a}$-generator exists. Then $R \subseteq P$.
Proof Let $L \in R$ be accepted by a probabilistic machine $M_R$. Suppose that $M_R$
runs in time $n^j$ where $n$ is the input length. We construct a deterministic machine
$M$ which accepts $L$ in polynomial time. On input $x$, machine $M$ emulates $M_R(x, y)$
using every possible output from the generator of length $n^j$ as a possible witness $y$.
The machine $M$ accepts $x$ if and only if some output $w$ from the generator causes
$M_R(x, w)$ to accept. By assumption, the generator "stretches" $k$-bit inputs into $2^{k^a}$-bit
pseudo-random sequences. Thus, in order to get all pseudo-random sequences of length
$n^j$ output by the generator, the deterministic machine $M$ tries all inputs to the
generator of length

$$(j \log n)^{1/a}.$$

Also, the generator can output each sequence of length $n^j$ in time $Poly(n^j)$ and the
time required for each deterministic simulation of the probabilistic machine $M_R$ is
$Poly(n^j)$. The total time required for the entire simulation is therefore:

$$Poly(n^j)\left( 2^{(j \log n)^{1/a}} \right) = Poly(n).$$

Thus, the machine $M_R$ can be simulated deterministically by $M$ in polynomial time.
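
The structure of the deterministic machine $M$ is a loop over all seeds. The sketch below shows only that structure, under loud assumptions: `toy_generator` is a hypothetical stand-in that stretches $k$ bits to $2^k$ bits and makes no claim of pseudo-randomness, and `toy_machine` is a toy one-sided-error procedure for the language "strings containing a 1", not a genuine machine for a language in $R$.

```python
from itertools import product
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def deterministic_accepts(
    x: Bits,
    seed_len: int,
    generator: Callable[[Bits], Bits],
    machine: Callable[[Bits, Bits], bool],
) -> bool:
    """Accept x iff some generator output, used as the witness, makes the machine accept."""
    for seed in product((0, 1), repeat=seed_len):
        if machine(x, generator(seed)):
            return True
    return False


def toy_generator(seed: Bits) -> Bits:
    """Hypothetical stand-in: output bit i is the inner product of the seed with the bits of i."""
    k = len(seed)
    return tuple(
        sum(seed[j] * ((i >> j) & 1) for j in range(k)) % 2
        for i in range(2 ** k)
    )


def toy_machine(x: Bits, y: Bits) -> bool:
    """Toy one-sided check: the witness y selects one position of x to inspect."""
    position = (2 * y[1] + y[2]) % len(x)
    return x[position] == 1
```

As in the proof, the toy machine never accepts an input outside its language, so `deterministic_accepts` can only err by rejecting; whether it does depends entirely on whether the generator outputs cover a good witness.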

Note that in our original definition of generators we required
only that each seed could be found in probabilistic polynomial time. Here, where we
wish to use the generator for a deterministic algorithm, we simply try every string of
length $(j \log n)^{1/a}$ as an input to the generator. Some of these might not be actual seeds
for the generator, hence the resulting output might not be a pseudo-random sequence.
However, this method will cover every possible seed, which is all that is necessary here.
It remains to show that the machine $M$ accepts the same language $L$ which is
accepted by $M_R$. As usual, let $I_k$ be the set of inputs to the generator of length $k$
and let $S_k = \{G(x) \mid x \in I_k\}$. We will show that if $M$ does not accept $L$, then the set
$S = \bigcup_k S_k$ output by the generator fails a $2^{k^a}$-statistical test.

Recall that if $x \in L$ then for a randomly chosen string $y$, $M_R(x, y)$ accepts with
probability at least $\frac{1}{2}$; if $x \notin L$ then for no string $y$ does $M_R(x, y)$ accept. So if $M$ does not
accept language $L$, there must be some $z \in L$ which $M$ incorrectly rejects. (That is,
none of the possible witnesses $w$ output by the generator cause $M_R(z, w)$ to accept
even though $z \in L$.) Furthermore, there must exist infinitely many such strings $z \in L$
or else $M$ could be repaired without increasing its running time.

Let $z$ be any element of $L$ which $M$ fails to accept. Recall that the probabilistic
machine $M_R$ runs in time $n^j$ where $n$ is input length. So, on input $z$, the deterministic
machine $M$ will simulate $M_R(z, y)$ on all strings $y$ of length $|z|^j$ output by the
generator. Thus $M$ will have to try all seeds of length $(j \log(|z|))^{1/a}$ since the generator
"stretches" $k$-bit inputs into $2^{k^a}$-bit outputs. For the remainder of the proof, we will
write $k$ in place of $(j \log(|z|))^{1/a}$.

For each of the infinitely many such strings $z$, there will be a circuit $C_k$ with $|z|^j$
inputs which has $z$ "hard-wired" in. On any input $y$, the circuit $C_k$ simulates $M_R(z, y)$.
This circuit outputs 1 if and only if $M_R(z, y)$ accepts. Since $M_R(z, y)$ runs in $Poly(|z|)$
time, the circuit $C_k$ has size $Poly(|z|)$. Since:

$$k = (j \log(|z|))^{1/a}$$

the circuit has size $Poly(2^{k^a})$.

Since $z \in L$ we have, by the definition of the class $R$, for a randomly chosen string $y$ of length
$|z|^j$:

$$p_k^R = \mathrm{Prob}\{C_k(y) = 1\} \geq \frac{1}{2}.$$

However, since $M_R(z, y)$ does not accept for any $y \in S_k$, for a randomly chosen $y \in S_k$
we have:

$$p_k^S = \mathrm{Prob}\{C_k(y) = 1\} = 0.$$

Thus $p_k^R - p_k^S \geq \frac{1}{2}$ so there is certainly a polynomial $R$ such that $p_k^R - p_k^S > \frac{1}{R(k)}$.
Therefore, the set $S$ fails the $2^{k^a}$-statistical test $\{C_k\}$, contradicting our assumption
that $S$ is the output of a $2^{k^a}$-generator. This concludes the proof of both Lemma 5.4
and Theorem 5.
Appendix

In this appendix, we find a bound for $t$ which satisfies the following inequality:

$$2(1 - \Phi(z)) < \epsilon$$

where

$$z = 2c\sqrt{t}$$

and

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du.$$

First, we solve the following for $z$:

$$2\left( 1 - \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du \right) < \epsilon.$$

Rearranging and using the fact that $\int_{-\infty}^{\infty} e^{-u^2/2}\, du = \sqrt{2\pi}$ gives:

$$\int_{z}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du < \frac{\epsilon}{2}.$$

There is, unfortunately, no closed form solution to the integral on the left side. We
can, however, bound the left side from above by something which is easy to integrate.
Assuming that $z$ is greater than or equal to 1, the value of $z$ obtained from solving the
following inequality will certainly satisfy the original inequality. We will address the
fact that $z$ must be greater than or equal to 1 below.

$$\int_{z}^{\infty} u\, e^{-u^2/2}\, du < \frac{\epsilon}{2}.$$

Integrating gives:

$$z^2 > 2(\ln 2 - \ln \epsilon).$$

Now, recall that $z = 2c\sqrt{t}$ which gives:

$$t > \frac{\ln 2 - \ln \epsilon}{2c^2}. \qquad (1)$$
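
Inequality (1) can be checked numerically against the exact normal tail. The sketch below is a sanity check only: it relies on the identity $2(1 - \Phi(z)) = \mathrm{erfc}(z/\sqrt{2})$, and the function names and the sample values of $c$ and $\epsilon$ are assumptions chosen for illustration. It also enforces $2c\sqrt{t} \geq 1$, the side condition under which the bound was derived.

```python
import math


def two_sided_tail(z: float) -> float:
    """2 * (1 - Phi(z)) for the standard normal, via the complementary error function."""
    return math.erfc(z / math.sqrt(2))


def trials_for(c: float, eps: float) -> int:
    """An integer t with t > (ln 2 - ln eps) / (2 c^2) and 2 c sqrt(t) >= 1."""
    bound_1 = (math.log(2) - math.log(eps)) / (2 * c * c)
    bound_2 = 1 / (4 * c * c)  # ensures z = 2 c sqrt(t) is at least 1
    return math.floor(max(bound_1, bound_2)) + 1


def bound_holds(c: float, eps: float) -> bool:
    """Check that the chosen t really gives a two-sided tail below eps."""
    t = trials_for(c, eps)
    z = 2 * c * math.sqrt(t)
    return z >= 1 and two_sided_tail(z) < eps
```

For example, `bound_holds(0.1, 1e-3)` and `bound_holds(0.01, 1e-9)` both hold; the bound is not tight, which is expected since the integrand was replaced by a larger one.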

In order to bound our original integral, we had to assume that $z$ be greater than or
equal to 1. Since $z = 2c\sqrt{t}$ we get:

$$2c\sqrt{t} \geq 1$$

which means:

$$t \geq \frac{1}{4c^2}. \qquad (2)$$

Now, since $0 < c < 1$ we can satisfy both inequalities (1) and (2) by choosing $t$ so
that:

In 2 - In c 